[jira] [Commented] (HIVE-18323) Vectorization: add the support of timestamp in VectorizedPrimitiveColumnReader for parquet

Vihang Karajgaonkar (JIRA) Mon, 08 Jan 2018 18:21:25 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-18323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317556#comment-16317556
 ]


Vihang Karajgaonkar commented on HIVE-18323:
--------------------------------------------

Are timestamps serialized as binary rather than as longs, judging by 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L499
 ?

If so, I think we should read the values as binary instead of reading longs. 
In that case we should use 
{{NanoTimeUtils.getTimestamp(NanoTime.fromBinary(dataColumn.readBytes()), false)}} 
to deserialize the timestamps from the binary values.
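
To make the suggestion concrete, here is a rough, standalone sketch of what decoding such a binary value involves. It assumes the commonly used INT96 timestamp layout (12 bytes, little-endian: 8 bytes of nanoseconds within the day, then 4 bytes of Julian day number). The class and method names ({{Int96TimestampSketch}}, {{decode}}, {{encode}}) are made up for illustration and are not Hive APIs, and the sketch does no time-zone adjustment, so it is not a drop-in equivalent of the {{NanoTimeUtils}} call above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

// Hypothetical helper, not part of Hive: decodes a 12-byte INT96 timestamp.
final class Int96TimestampSketch {
    // Julian day number of 1970-01-01.
    static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
    static final long SECONDS_PER_DAY = 86_400L;
    static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Assumed layout: little-endian long (nanos of day), then int (Julian day).
    static Instant decode(byte[] int96) {
        if (int96.length != 12) {
            throw new IllegalArgumentException("expected 12 bytes, got " + int96.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong();
        int julianDay = buf.getInt();
        long epochSeconds = (julianDay - JULIAN_DAY_OF_EPOCH) * SECONDS_PER_DAY
                + nanosOfDay / NANOS_PER_SECOND;
        // No time-zone conversion is applied here.
        return Instant.ofEpochSecond(epochSeconds, nanosOfDay % NANOS_PER_SECOND);
    }

    // Inverse of decode, included only so the sketch can build sample input.
    static byte[] encode(int julianDay, long nanosOfDay) {
        return ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt(julianDay)
                .array();
    }

    public static void main(String[] args) {
        // One day and 1.5 seconds after the Unix epoch.
        System.out.println(decode(encode(2_440_589, 1_500_000_000L)));
    }
}
```

The point is only that the reader has to consume 12 raw bytes per value and split them into a day part and a nanosecond part; reading the column as a long cannot recover both.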

> Vectorization: add the support of timestamp in 
> VectorizedPrimitiveColumnReader for parquet
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-18323
>                 URL: https://issues.apache.org/jira/browse/HIVE-18323
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Vectorization
>    Affects Versions: 3.0.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-18323.1.patch
>
>
> {noformat}
> CREATE TABLE `t1`(
>   `ts` timestamp,
>   `s1` string)
> STORED AS PARQUET;
> set hive.vectorized.execution.enabled=true;
> SELECT * from t1 SORT BY s1;
> {noformat}
> This query throws an exception because timestamp is not yet supported here.
> {noformat}
> Caused by: java.io.IOException: java.io.IOException: Unsupported type: optional int96 ts
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>         at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>         at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:116)
> {noformat}



