[https://issues.apache.org/jira/browse/HIVE-22224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969723#comment-16969723]
Brandon Scheller commented on HIVE-22224:
-----------------------------------------
[~chenxiang] Further investigation has shown that the above issue should not be
a blocking problem.
This is because the spark-avro converter
[https://github.com/apache/spark/blob/master/external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala#L151]
writes timestamp data as timestamp-micros. This means that if we see a
LongWritable for a timestamp, we can assume it came from spark-avro and
therefore parse it into a timestamp, treating the long value as
timestamp-micros.
This can serve as a workaround until Avro is upgraded and we can use the
actual Avro logical types.
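A minimal sketch of that interpretation, assuming the incoming long is
microseconds since the epoch (the class and method names here are illustrative,
not part of an actual Hive patch):

import java.sql.Timestamp;
import org.apache.hadoop.io.LongWritable;

public class TimestampMicrosUtil {
  // Convert a timestamp-micros long into a java.sql.Timestamp, preserving
  // the sub-millisecond part via the nanos field.
  public static Timestamp fromMicros(long micros) {
    long millis = Math.floorDiv(micros, 1_000L);                    // whole milliseconds
    int nanos = (int) (Math.floorMod(micros, 1_000_000L) * 1_000L); // micros within the second, as nanos
    Timestamp ts = new Timestamp(millis);
    ts.setNanos(nanos);
    return ts;
  }

  public static void main(String[] args) {
    // 2019-11-07T00:00:00.123456 UTC expressed in microseconds.
    LongWritable w = new LongWritable(1573084800123456L);
    System.out.println(fromMicros(w.get())); // fractional second prints as .123456
  }
}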
> Support Parquet-Avro Timestamp Type
> -----------------------------------
>
> Key: HIVE-22224
> URL: https://issues.apache.org/jira/browse/HIVE-22224
> Project: Hive
> Issue Type: Bug
> Components: Database/Schema
> Affects Versions: 2.3.5, 2.3.6
> Reporter: cdmikechen
> Assignee: cdmikechen
> Priority: Major
> Labels: parquet
> Fix For: 2.3.7
>
>
> When a user creates an external table and imports parquet-avro data written
> with Avro 1.8.2 (which supports logical types) into Hive 2.3 or an earlier
> version, Hive cannot read timestamp type column data correctly.
> Hive reads it as a LongWritable, because it is actually stored as a
> long (logical_type=timestamp-millis). So we may add some code to
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableTimestampObjectInspector.java
> to let Hive cast the long value to the timestamp type.
> Some code like the snippet below:
>
> public Timestamp getPrimitiveJavaObject(Object o) {
>   // parquet-avro stores the timestamp as a long (logical_type=timestamp-millis),
>   // so it arrives here as a LongWritable rather than a TimestampWritable.
>   if (o instanceof LongWritable) {
>     return new Timestamp(((LongWritable) o).get());
>   }
>   return o == null ? null : ((TimestampWritable) o).getTimestamp();
> }
>