[
https://issues.apache.org/jira/browse/DRILL-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935974#comment-14935974
]
Parth Chandra commented on DRILL-2908:
--------------------------------------
Doc looks good. The SQLType for INT96 is VARBINARY(12), BTW.
Vicky brought up an interesting caveat: the timestamp encoding used by
Hive/Impala is stored as a Parquet INT96 and read by Drill as a VARBINARY.
Converting a *Drill* timestamp to VARBINARY is allowed by the CAST functions,
but the resulting VARBINARY is not the same as the INT96.
Here's an example:
1) Create a Drill table after reading an int96 and converting it to a timestamp:
create table t2(c1) as select CONVERT_FROM(created_ts, 'TIMESTAMP_IMPALA') from
t1 order by 1 limit 1;
Now t1.created_ts is an int96 (a Hive/Impala timestamp), while t2.c1 is a
Drill timestamp.
These two types are not comparable; i.e., we cannot use a condition like
t1.created_ts = t2.c1.
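To make the byte-level difference concrete, here is a minimal Python sketch of the INT96 timestamp layout written by Hive/Impala: 8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, both little-endian. This is an illustration only, not Drill's implementation. The function names are hypothetical, and treating the value as UTC is an assumption.

```python
import struct
from datetime import datetime, timedelta, timezone

# Julian day number that corresponds to 1970-01-01.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_int96_timestamp(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 timestamp (hypothetical helper, assumes UTC)."""
    if len(raw) != 12:
        raise ValueError("an INT96 timestamp is exactly 12 bytes")
    # Little-endian: 8-byte nanoseconds of day, then 4-byte Julian day.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    # Python datetimes hold microseconds, so sub-microsecond digits are dropped.
    return EPOCH + timedelta(days=days, microseconds=nanos_of_day // 1000)


def encode_int96_timestamp(ts: datetime) -> bytes:
    """Encode a timezone-aware datetime into the same 12-byte layout."""
    delta = ts - EPOCH
    nanos_of_day = (delta.seconds * 1_000_000 + delta.microseconds) * 1000
    return struct.pack("<qi", nanos_of_day, delta.days + JULIAN_DAY_OF_UNIX_EPOCH)
```

A 12-byte value built this way is what a VARBINARY(12) read from such a column holds, which is why it has to be decoded (for example with CONVERT_FROM and 'TIMESTAMP_IMPALA') before it can be compared with a Drill timestamp.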
> Support reading the Parquet int 96 type
> ---------------------------------------
>
> Key: DRILL-2908
> URL: https://issues.apache.org/jira/browse/DRILL-2908
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Parquet
> Reporter: Jason Altekruse
> Assignee: Parth Chandra
> Labels: document
> Fix For: 1.2.0
>
>
> While Drill does not currently have an int96 type, it is supported by the
> parquet format and we should be able to read files that contain columns of
> this type. For now we will read the data into a varbinary and users will have
> to use existing convert_from functions or write their own to interpret the
> type of data actually stored. One example is the Impala timestamp format
> which is encoded in an int96 column.