Hi Daniel,

I am a beginner with Cascading and Parquet.
As per your guideline, I have created the table with the following commands:

hive> create table test3(timestampField timestamp) stored as parquet;
hive> load data local inpath '/home/hduser/parquet_testing/part-00000-m-00000.parquet' into table test3;
hive> select  * from test3;

After running the above commands, I got the following output:

OK
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Failed with exception 
java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast 
to org.apache.hadoop.hive.serde2.io.TimestampWritable

What I actually want is to create a Parquet file through Cascading Parquet and
load it into Hive. My Parquet file contains data that I want to store in a Hive
column of type timestamp.
But Cascading Parquet has no timestamp datatype, so I declared the field as
binary and tried to load it into a table whose field is of type timestamp.
That is when I got the above exception.
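
Since parquet-format 2.2.0 gives me no timestamp type from Cascading, one
workaround I am considering (the table name below is hypothetical, and I have
not verified this against my environment) is to keep the value as a
UTF8-annotated binary, declare the Hive column as string, and cast at query
time:

hive> create table test_str (timestampField string) stored as parquet;
hive> load data local inpath '/home/hduser/parquet_testing/part-00000-m-00000.parquet' into table test_str;
hive> select cast(timestampField as timestamp) from test_str;

Is this a reasonable approach, or is there a better way?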

Please help me solve this problem.

Currently I am using:
    Hive 1.1.0-cdh5.4.2
    Cascading 2.5.1
    parquet-format-2.2.0


Thanks,
Santlal Gupta


-----Original Message-----
From: Daniel Weeks [mailto:[email protected]] 
Sent: Friday, July 17, 2015 9:09 PM
To: [email protected]
Subject: Re: Issue while reading Parquet file in Hive

Santlal,

It might just be as simple as the storage format for your Hive table. I notice
you say:

hive> create table timestampTest (timestampField timestamp);

But this should be:

hive> create table timestampTest (timestampField timestamp) stored as
parquet;

Hive is probably processing the file as text. Please run 'hive> desc
formatted timestampTest;' and verify that the input format, output format, and
SerDe for the table are actually Parquet.
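
For reference, on a Parquet-backed table the relevant lines of 'desc
formatted' should look roughly like this (class names as shipped with Hive
1.x; double-check against your install):

SerDe Library:  org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat:    org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat:   org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat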

-Dan

On Thu, Jul 16, 2015 at 11:54 PM, Santlal J Gupta < 
[email protected]> wrote:

> Hello,
>
> I have following issue.
>
> I have created a Parquet file through Cascading Parquet and want to
> load it into a Hive table. The file is loaded successfully, but when
> I try to read it, it gives null instead of the actual data. Please
> find the code below.
>
> package com.parquet.TimestampTest;
>
> import cascading.flow.FlowDef;
> import cascading.flow.hadoop.HadoopFlowConnector;
> import cascading.pipe.Pipe;
> import cascading.scheme.Scheme;
> import cascading.scheme.hadoop.TextDelimited;
> import cascading.tap.SinkMode;
> import cascading.tap.Tap;
> import cascading.tap.hadoop.Hfs;
> import cascading.tuple.Fields;
> import parquet.cascading.ParquetTupleScheme;
>
> public class GenrateTimeStampParquetFile {
>     static String inputPath = "target/input/timestampInputFile";
>     static String outputPath = "target/parquetOutput/TimestampOutput";
>
>     public static void main(String[] args) {
>         write();
>     }
>
>     private static void write() {
>         Fields field = new Fields("timestampField").applyTypes(String.class);
>         Scheme sourceSch = new TextDelimited(field, true, "\n");
>
>         Fields outputField = new Fields("timestampField");
>
>         Scheme sinkSch = new ParquetTupleScheme(field, outputField,
>                 "message TimeStampTest { optional binary timestampField; }");
>
>         Tap source = new Hfs(sourceSch, inputPath);
>         Tap sink = new Hfs(sinkSch, outputPath, SinkMode.REPLACE);
>
>         Pipe pipe = new Pipe("Hive timestamp");
>
>         FlowDef fd = FlowDef.flowDef()
>                 .addSource(pipe, source)
>                 .addTailSink(pipe, sink);
>
>         new HadoopFlowConnector().connect(fd).complete();
>     }
> }
>
> Input file:
>
> timestampInputFile
>
> timestampField
> 1988-05-25 15:15:15.254
> 1987-05-06 14:14:25.362
>
> After running the code following files are generated.
> Output :
> 1. part-00000-m-00000.parquet
> 2. _SUCCESS
> 3. _metadata
> 4. _common_metadata
>
> I have created the table in Hive to load the
> part-00000-m-00000.parquet file.
> The file is loaded successfully, but it gives null values while reading.
>
> I have used following command.
>
> hive> create table timestampTest (timestampField timestamp);
>
> hive> load data local inpath '/home/hduser/parquet_testing/part-00000-m-00000.parquet' into table timestampTest;
> Loading data to table parquet_timestamp_test.timestamptest
> Table parquet_timestamp_test.timestamptest stats: [numFiles=1, totalSize=296]
> OK
> Time taken: 0.508 seconds
>
> hive> select * from timestamptest;
> OK
> NULL
> NULL
> NULL
> Time taken: 0.104 seconds, Fetched: 3 row(s)
>
>
