[
https://issues.apache.org/jira/browse/HIVE-21002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772078#comment-16772078
]
Zoltan Ivanfi commented on HIVE-21002:
--------------------------------------
As we discussed on the Hive mailing list, I modified the sub-tasks of this JIRA
to reflect the new solution we agreed upon: The historical (backwards- and
forwards-compatible) way of handling timestamps should be restored while
keeping the new semantics as well. The details can be read in the
descriptions of the sub-tasks.
> Backwards incompatible change: Hive 3.1 reads back Avro and Parquet
> timestamps written by Hive 2.x incorrectly
> --------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-21002
> URL: https://issues.apache.org/jira/browse/HIVE-21002
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0, 3.1.1
> Reporter: Zoltan Ivanfi
> Priority: Major
>
> Hive 3.1 reads back Avro and Parquet timestamps written by Hive 2.x
> incorrectly. As an example session to demonstrate this problem, create a
> dataset using Hive version 2.x in America/Los_Angeles:
> {code:sql}
> hive> create table ts_‹format› (ts timestamp) stored as ‹format›;
> hive> insert into ts_‹format› values ('2018-01-01 00:00:00.000');
> {code}
> Querying this table by issuing
> {code:sql}
> hive> select * from ts_‹format›;
> {code}
> from different time zones using different versions of Hive and different
> storage formats gives the following results:
> |‹format›|Writer time zone (in Hive 2.x)|Reader time zone|Result in Hive 2.x reader|Result in Hive 3.1 reader|
> |Avro and Parquet|America/Los_Angeles|America/Los_Angeles|2018-01-01 *00*:00:00.0|2018-01-01 *08*:00:00.0|
> |Avro and Parquet|America/Los_Angeles|Europe/Paris|2018-01-01 *09*:00:00.0|2018-01-01 *08*:00:00.0|
> |Textfile and ORC|America/Los_Angeles|America/Los_Angeles|2018-01-01 00:00:00.0|2018-01-01 00:00:00.0|
> |Textfile and ORC|America/Los_Angeles|Europe/Paris|2018-01-01 00:00:00.0|2018-01-01 00:00:00.0|
> *Hive 3.1 clearly gives different results than Hive 2.x for timestamps stored
> in Avro and Parquet formats.* Apache ORC behaviour has not changed because it
> was modified to adjust timestamps to retain backwards compatibility. Textfile
> behaviour has not changed because its processing involves parsing and
> formatting instead of proper serialization and deserialization, so Textfile
> timestamps inherently had LocalDateTime semantics even in Hive 2.x.
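
The Avro and Parquet rows of the table above can be reproduced with a small
model. This is only a sketch under stated assumptions, not Hive's actual code:
it assumes the Hive 2.x writer normalizes the wall-clock value to UTC before
storing it, that the Hive 2.x reader converts the stored instant back to the
reader's time zone, and that the Hive 3.1 reader displays the stored UTC fields
unchanged. The function names are hypothetical, and fixed offsets (UTC-8 and
UTC+1, valid for 1 January 2018) stand in for America/Los_Angeles and
Europe/Paris so that the example needs no time zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets that apply on 2018-01-01 (no DST in effect).
LOS_ANGELES = timezone(timedelta(hours=-8))  # stands in for America/Los_Angeles
PARIS = timezone(timedelta(hours=1))         # stands in for Europe/Paris
FMT = "%Y-%m-%d %H:%M:%S"


def write_hive2(wall_clock: str, writer_tz: timezone) -> datetime:
    """Hypothetical model of the Hive 2.x Avro/Parquet writer:
    interpret the wall-clock value in the writer's zone and store it as UTC."""
    local = datetime.strptime(wall_clock, FMT).replace(tzinfo=writer_tz)
    return local.astimezone(timezone.utc)


def read_with_instant_semantics(stored: datetime, reader_tz: timezone) -> str:
    """Model of the Hive 2.x reader: convert the stored instant to the reader's zone."""
    return stored.astimezone(reader_tz).strftime(FMT)


def read_with_local_semantics(stored: datetime) -> str:
    """Model of the Hive 3.1 reader: show the stored UTC fields without conversion."""
    return stored.strftime(FMT)


stored = write_hive2("2018-01-01 00:00:00", LOS_ANGELES)

for name, tz in (("America/Los_Angeles", LOS_ANGELES), ("Europe/Paris", PARIS)):
    print(name,
          "| 2.x model:", read_with_instant_semantics(stored, tz),
          "| 3.1 model:", read_with_local_semantics(stored))
```

Under these assumptions the 2.x model yields 00:00:00 in Los Angeles and
09:00:00 in Paris, while the 3.1 model yields 08:00:00 for both readers, which
matches the Avro and Parquet rows reported in the description.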