[
https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671191#comment-15671191
]
Vitalii Diravka commented on HIVE-9482:
---------------------------------------
Why is the hive.parquet.timestamp.skip.conversion option enabled by default?
According to the [parquet
spec|https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md#timestamp_millis],
parquet files don't store the local timezone, and we can't tell from a file
what the value of that option was when the parquet file was generated.
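
To illustrate the ambiguity, here is a minimal Python sketch. It does not
model Hive's actual code: the function stored_value, the UTC+3 writer offset
and the sample dates are all hypothetical, chosen only to show that two
different wall-clock times can end up as the same stored value, depending on
whether the writer converted to UTC.

```python
from datetime import datetime, timedelta, timezone


def stored_value(wall_clock, writer_offset_hours, convert_to_utc):
    """Return the naive timestamp a hypothetical writer would put in the file.

    wall_clock is a naive datetime in the writer's local time.
    If convert_to_utc is False, the wall-clock value is stored unchanged.
    """
    if not convert_to_utc:
        return wall_clock
    # Attach the writer's (assumed, fixed) offset, then shift to UTC
    # and drop the zone, since the file keeps no timezone information.
    local = wall_clock.replace(tzinfo=timezone(timedelta(hours=writer_offset_hours)))
    return local.astimezone(timezone.utc).replace(tzinfo=None)


# Writer A: local noon in UTC+3, converted to UTC before writing.
a = stored_value(datetime(2015, 1, 1, 12, 0), 3, convert_to_utc=True)

# Writer B: local 09:00, written as-is with no conversion.
b = stored_value(datetime(2015, 1, 1, 9, 0), 3, convert_to_utc=False)

# Both files hold the same value, so a reader cannot tell which rule applied.
indistinguishable = (a == b)
```

In this sketch both writers store 2015-01-01 09:00:00, so a reader has no way
to recover the original wall-clock time from the stored value alone.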
> Hive parquet timestamp compatibility
> ------------------------------------
>
> Key: HIVE-9482
> URL: https://issues.apache.org/jira/browse/HIVE-9482
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.15.0
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Fix For: 1.2.0
>
> Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch,
> parquet_external_time.parq
>
>
> In the current Hive implementation, timestamps are stored in UTC (converted
> from the current timezone), based on the original parquet timestamp spec.
> However, we find this is not compatible with other tools, and after some
> investigation it is not how other file formats, or even some databases,
> handle timestamps (Hive Timestamp is closer to the 'timestamp without
> timezone' datatype).
> This is the first part of the fix, which will restore compatibility with
> parquet-timestamp files generated by external tools by skipping conversion on
> reading.
> A later fix will change the write path to not convert, and stop the
> read-conversion even for files written by Hive itself.