[
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15626338#comment-15626338
]
ASF GitHub Bot commented on DRILL-4373:
---------------------------------------
Github user vdiravka commented on the issue:
https://github.com/apache/drill/pull/600
@parthchandra The known issue with Hive is that it stores timestamp values
in parquet files with the local time zone retained. That's why, when we retrieve
data from such a table, we should take the local timezone into account.
On the other hand, parquet files themselves don't involve any particular time
zone, so when we just read the file we shouldn't consider the local timezone.
This is also standard Drill behaviour with normal int64 timestamps.
So I decided that we need two `IMPALA_TIMESTAMP` functions: one for Hive and
one for regular parquet files.
I left the `IMPALA_TIMESTAMP` function without the local timezone adjustment and
added an `IMPALA_TIMESTAMP_LOCALTIMEZONE` function (used implicitly for Hive
timestamps when the Drill native parquet reader is enabled).
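To make the difference between the two variants concrete, here is a minimal Python sketch. It is not the Drill implementation. It assumes the usual Impala/Hive INT96 layout (8 little-endian bytes of nanoseconds within the day, then 4 little-endian bytes holding the Julian day number). The function names `int96_to_utc` and `int96_to_local_wall_clock` are hypothetical, and the direction of the local-zone adjustment is an assumption about what the Hive variant needs to do.

```python
import struct
from datetime import datetime, timedelta, timezone, tzinfo

JULIAN_DAY_OF_UNIX_EPOCH = 2440588  # Julian day number of 1970-01-01
UNIX_EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def int96_to_utc(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 value with no time zone adjustment (kept as UTC)."""
    if len(raw) != 12:
        raise ValueError("INT96 timestamp must be exactly 12 bytes")
    # Assumed layout: 8 bytes nanoseconds-of-day, then 4 bytes Julian day,
    # both little-endian.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    # datetime has microsecond resolution, so sub-microsecond digits are dropped.
    return UNIX_EPOCH_UTC + timedelta(days=days, microseconds=nanos_of_day // 1000)


def int96_to_local_wall_clock(raw: bytes, local_zone: tzinfo) -> datetime:
    """Hypothetical Hive-style variant: shift the decoded instant into
    local_zone and return the resulting wall-clock time without zone info."""
    return int96_to_utc(raw).astimezone(local_zone).replace(tzinfo=None)


if __name__ == "__main__":
    sample = struct.pack("<qi", 0, JULIAN_DAY_OF_UNIX_EPOCH)
    print(int96_to_utc(sample))
    print(int96_to_local_wall_clock(sample, timezone(timedelta(hours=-8))))
```

The same 12 bytes therefore yield different values depending on whether the reader applies the local zone, which is the reason for keeping two separate functions.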
Please let me know if this approach is good.
The changes are in a new commit for easier review.
> Drill and Hive have incompatible timestamp representations in parquet
> ---------------------------------------------------------------------
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Hive, Storage - Parquet
> Affects Versions: 1.8.0
> Reporter: Rahul Challapalli
> Assignee: Parth Chandra
> Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a
> Hive table on top of the parquet file and use "timestamp" as the column type,
> Drill fails to read the Hive table through the Hive storage plugin.
> Implementation:
> Added an int96-to-timestamp converter for both parquet readers, controlled
> by the system / session option "store.parquet.int96_as_timestamp".
> The option is false by default so that old query scripts that use the
> "convert_from TIMESTAMP_IMPALA" function keep working.
> When the option is true, using that function is unnecessary and can cause
> the query to fail.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)