[ 
https://issues.apache.org/jira/browse/DRILL-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935974#comment-14935974
 ] 

Parth Chandra commented on DRILL-2908:
--------------------------------------

Doc looks good. The SQLType for INT96 is VARBINARY (12), BTW.
Vicky brought up an interesting caveat: the timestamp encoding used by 
Hive/Impala is stored as a Parquet INT96 and read by Drill as a VARBINARY. 
The CAST functions allow converting a *Drill* timestamp to VARBINARY, 
but the resulting VARBINARY is not the same as the INT96.

Here's an example. Create a Drill table by reading an int96 column and 
converting it to a timestamp:

create table t2(c1) as select CONVERT_FROM(created_ts, 'TIMESTAMP_IMPALA') from 
t1 order by 1 limit 1;

Now t1.created_ts is an int96 (i.e. a Hive/Impala timestamp), while t2.c1 is a 
Drill timestamp.

These two types are not comparable; i.e. we cannot use a condition like 
t1.created_ts = t2.c1.
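
One possible way around this, as a sketch only: convert the int96 side to a 
Drill timestamp before comparing, so that both operands have the same type. 
This assumes the same CONVERT_FROM call used in the CTAS above is also accepted 
inside a join condition, which has not been verified here. The table names are 
the ones from the example, and c1 is the column that the CTAS defines on t2.

```sql
-- Sketch only: compare after converting the int96 (read as VARBINARY)
-- to a Drill timestamp, instead of comparing the raw columns directly.
select t1.created_ts, t2.c1
from t1
join t2
  on CONVERT_FROM(t1.created_ts, 'TIMESTAMP_IMPALA') = t2.c1;
```

The conversion goes from int96 to timestamp rather than the other way, since 
casting a Drill timestamp to VARBINARY does not reproduce the int96 bytes.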

> Support reading the Parquet int 96 type
> ---------------------------------------
>
>                 Key: DRILL-2908
>                 URL: https://issues.apache.org/jira/browse/DRILL-2908
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Parquet
>            Reporter: Jason Altekruse
>            Assignee: Parth Chandra
>              Labels: document
>             Fix For: 1.2.0
>
>
> While Drill does not currently have an int96 type, it is supported by the 
> parquet format and we should be able to read files that contain columns of 
> this type. For now we will read the data into a varbinary and users will have 
> to use existing convert_from functions or write their own to interpret the 
> type of data actually stored. One example is the Impala timestamp format 
> which is encoded in an int96 column.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
