Hi folks, any suggestions or thoughts on the question/issue posted below?
Regards,
Srinivas

On 2018/09/19 10:47:38, Srinivas M <s...@gmail.com> wrote:

> Hi,
>
> We have a Java application that writes Parquet files. We are using the
> Parquet 1.9.0 API to write timestamp data. Since there are
> incompatibilities between the Parquet and Hive representations of
> timestamps, we have tried to work around this by writing the Parquet
> timestamp data as a 12-byte array, converting the timestamp fields into
> the format Hive expects. However, when setting the field type in the
> schema, since the Avro schema types have no enumeration for the INT96
> type, we set it to bytes, assuming Hive would be able to read the data
> since it was written in the format Hive expects. When we try to read the
> data from the Hive table, we run into the exception below.
>
> Questions:
> ----------
> 1. Is there any way to work around this issue and make Hive read the
>    data when the timestamp field is set as bytes?
> 2. Is there any way the data type can be set as INT96 in the Parquet
>    schema?
>
> Exception:
> ==========
> Failed with exception
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be
> cast to org.apache.hadoop.hive.serde2.io.TimestampWritable
>
> Schema of the file:
> ===================
> file schema: parquet.filecc
> ---------------------------
> C1: REQUIRED INT32 R:0 D:0
> C2: REQUIRED BINARY O:UTF8 R:0 D:0
> C3: REQUIRED BINARY O:UTF8 R:0 D:0
> C4: REQUIRED BINARY R:0 D:0   ----> Timestamp column
> C5: REQUIRED BINARY R:0 D:0   ----> Timestamp column
>
> hive> show create table HiveParquetTimestamp;
> OK
> CREATE EXTERNAL TABLE `HiveParquetTimestamp`(
>   `c1` int,
>   `c2` char(4),
>   `c3` varchar(8),
>   `c4` timestamp,
>   `c5` timestamp)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION
>   'hdfs://cdhkrb123.fyre.com:8020/tmp/HiveParquetTimestamp'
>
> --
> Srinivas
> You have to grow from the inside out. None can teach you, none can make
> you spiritual.
> -Narendra Nath Dutta (Swami Vivekananda)
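For context on the 12-byte conversion the original mail describes: the layout Hive and Impala read for INT96 timestamps is 8 little-endian bytes of nanoseconds-of-day followed by 4 little-endian bytes of the Julian day number. A minimal sketch of that conversion in plain JDK Java, assuming that standard layout (the class name `Int96Timestamp` is just for illustration, not from the original code):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.LocalDateTime;
import java.time.temporal.JulianFields;

public class Int96Timestamp {
    // Pack a LocalDateTime into the 12-byte INT96 layout Hive expects:
    // bytes 0-7: nanoseconds since midnight, little-endian
    // bytes 8-11: Julian day number, little-endian
    public static byte[] toInt96Bytes(LocalDateTime ts) {
        long nanosOfDay = ts.toLocalTime().toNanoOfDay();
        int julianDay = (int) ts.toLocalDate().getLong(JulianFields.JULIAN_DAY);
        ByteBuffer buf = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(nanosOfDay);
        buf.putInt(julianDay);
        return buf.array();
    }
}
```

Note the byte order: writing the two fields big-endian is a common source of timestamps that Hive silently misreads rather than rejects.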
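On question 2: the Avro schema path indeed cannot express INT96, but if the writer builds the Parquet schema directly with parquet-mr (e.g. via `org.apache.parquet.schema.MessageTypeParser.parseMessageType`), the Parquet schema language does have an `int96` primitive. A sketch of what the file schema shown above would look like with the timestamp columns declared as INT96 instead of BINARY (assuming the same column names):

```
message parquet.filecc {
  required int32 C1;
  required binary C2 (UTF8);
  required binary C3 (UTF8);
  required int96 C4;
  required int96 C5;
}
```

With the columns typed as INT96, Hive's ParquetHiveSerDe maps them to `timestamp` directly, which avoids the `BytesWritable` → `TimestampWritable` cast failure in the exception above.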