Santlal,

It might be as simple as the storage format for your Hive table. I notice you say:

hive> create table timestampTest (timestampField timestamp);

But this should be:

hive> create table timestampTest (timestampField timestamp) stored as parquet;

Hive is probably processing the file as text. Please run 'hive> desc formatted timestampTest;' and verify that the input/output format and serde for the table are actually Parquet.

-Dan

On Thu, Jul 16, 2015 at 11:54 PM, Santlal J Gupta <[email protected]> wrote:

> Hello,
>
> I have the following issue.
>
> I have created a parquet file through cascading parquet and want to load it
> into a Hive table. The parquet file is loaded successfully, but when I try
> to read it, Hive gives null instead of the actual data. Please find the
> code below.
>
> package com.parquet.TimestampTest;
>
> import cascading.flow.FlowDef;
> import cascading.flow.hadoop.HadoopFlowConnector;
> import cascading.pipe.Pipe;
> import cascading.scheme.Scheme;
> import cascading.scheme.hadoop.TextDelimited;
> import cascading.tap.SinkMode;
> import cascading.tap.Tap;
> import cascading.tap.hadoop.Hfs;
> import cascading.tuple.Fields;
> import parquet.cascading.ParquetTupleScheme;
>
> public class GenrateTimeStampParquetFile {
>     static String inputPath = "target/input/timestampInputFile";
>     static String outputPath = "target/parquetOutput/TimestampOutput";
>
>     public static void main(String[] args) {
>         write();
>     }
>
>     private static void write() {
>         // Read the timestamp strings from a newline-delimited text source
>         Fields field = new Fields("timestampField").applyTypes(String.class);
>         Scheme sourceSch = new TextDelimited(field, true, "\n");
>
>         Fields outputField = new Fields("timestampField");
>
>         // Write them out as an optional binary field in the parquet schema
>         Scheme sinkSch = new ParquetTupleScheme(field, outputField,
>                 "message TimeStampTest{optional binary timestampField;}");
>
>         Tap source = new Hfs(sourceSch, inputPath);
>         Tap sink = new Hfs(sinkSch, outputPath, SinkMode.REPLACE);
>
>         Pipe pipe = new Pipe("Hive timestamp");
>
>         FlowDef fd = FlowDef.flowDef().addSource(pipe, source)
>                 .addTailSink(pipe, sink);
>
>         new HadoopFlowConnector().connect(fd).complete();
>     }
> }
>
> Input file:
>
> timestampInputFile
>
> timestampField
> 1988-05-25 15:15:15.254
> 1987-05-06 14:14:25.362
>
> After running the code, the following files are generated.
>
> Output:
> 1. part-00000-m-00000.parquet
> 2. _SUCCESS
> 3. _metadata
> 4. _common_metadata
>
> I have created a table in Hive to load the part-00000-m-00000.parquet file.
> The file is loaded successfully, but it gives null values while reading.
>
> I have used the following commands.
>
> hive> create table timestampTest (timestampField timestamp);
>
> hive> load data local inpath
> '/home/hduser/parquet_testing/part-00000-m-00000.parquet' into table
> timestampTest;
> Loading data to table parquet_timestamp_test.timestamptest
> Table parquet_timestamp_test.timestamptest stats: [numFiles=1,
> totalSize=296]
> OK
> Time taken: 0.508 seconds
>
> hive> select * from timestamptest;
> OK
> NULL
> NULL
> NULL
> Time taken: 0.104 seconds, Fetched: 3 row(s)
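One thing that can be ruled out here is the textual format of the sample rows themselves: `java.sql.Timestamp` accepts the same `yyyy-[m]m-[d]d hh:mm:ss[.f...]` form that Hive's timestamp type uses, so a quick standalone check (a sketch, independent of Hive and Cascading) confirms the input values parse cleanly — the NULLs are not caused by malformed timestamp strings:

```java
import java.sql.Timestamp;

public class TimestampFormatCheck {
    public static void main(String[] args) {
        // The two sample rows from the input file above; Timestamp.valueOf
        // throws IllegalArgumentException if the string is not in the
        // yyyy-[m]m-[d]d hh:mm:ss[.f...] form that Hive's timestamp type uses.
        String[] samples = {"1988-05-25 15:15:15.254", "1987-05-06 14:14:25.362"};
        for (String s : samples) {
            Timestamp t = Timestamp.valueOf(s);
            System.out.println(t); // round-trips to the same textual form
        }
    }
}
```

Since both values parse, the remaining suspect is how the table reads the file, which points back at the serde/storage format.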
