[ https://issues.apache.org/jira/browse/HIVE-20980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740613#comment-16740613 ]
Jesus Camacho Rodriguez commented on HIVE-20980:
------------------------------------------------
[~klcopp], thanks for your patch. I do not think we have reached an agreement
on how to move forward with timestamp types (at least in Hive). What I do
not like about the current proposal is that we would still have different
semantics (local date time vs. instant) for the same type (timestamp) depending
on the storage format used for the table (e.g., text/ORC vs. Parquet/Avro).
This seems fragile going forward. In addition, I have already described my
concerns about having 4 types where 'timestamp without time zone' does not have
the same semantics as 'timestamp'.
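To make the two semantics concrete, here is a minimal sketch using plain
java.time rather than Hive's own timestamp classes. The class and method names
(TimestampSemantics, readAsLocalDateTime, readAsInstant) and the two zones are
hypothetical and chosen only for illustration; this is not how any Hive
reader is actually implemented.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class TimestampSemantics {
    // Local-date-time semantics: the wall-clock value is returned unchanged,
    // whatever zone the reader happens to be in.
    static LocalDateTime readAsLocalDateTime(LocalDateTime stored, ZoneId readerZone) {
        return stored;
    }

    // Instant semantics: the stored value is a point on the UTC timeline and
    // is rendered in the reader's zone.
    static LocalDateTime readAsInstant(Instant stored, ZoneId readerZone) {
        return LocalDateTime.ofInstant(stored, readerZone);
    }

    public static void main(String[] args) {
        // Hypothetical writer and reader zones, for illustration only.
        ZoneId writerZone = ZoneId.of("America/Los_Angeles");
        ZoneId readerZone = ZoneId.of("Europe/Berlin");

        LocalDateTime written = LocalDateTime.of(2019, 1, 11, 12, 0);
        Instant storedInstant = written.atZone(writerZone).toInstant();

        // Same written value, two different answers depending on semantics.
        System.out.println(readAsLocalDateTime(written, readerZone)); // 2019-01-11T12:00
        System.out.println(readAsInstant(storedInstant, readerZone)); // 2019-01-11T21:00
    }
}
```

If the same 'timestamp' column behaved like the first method for text/ORC and
like the second for Parquet/Avro, a query could return different values purely
because of the table's storage format.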
The patch reimplements 'timestamp with local time zone' semantics for the
'timestamp' type, and it even relies on the session time zone instead of the
system time zone, which is something that was not done before AFAIK.
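As a rough illustration of why the choice of zone matters, the sketch below
converts a value to UTC on write and back from UTC on read, again with plain
java.time. The helpers toUtc and fromUtc and both zone IDs are assumptions
made for this example; they are not taken from the patch.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class ZoneChoiceSketch {
    // Interpret a wall-clock value in `zone` and return the UTC wall-clock value.
    static LocalDateTime toUtc(LocalDateTime local, ZoneId zone) {
        return local.atZone(zone).withZoneSameInstant(ZoneOffset.UTC).toLocalDateTime();
    }

    // Inverse of toUtc: render a UTC wall-clock value in `zone`.
    static LocalDateTime fromUtc(LocalDateTime utc, ZoneId zone) {
        return utc.atZone(ZoneOffset.UTC).withZoneSameInstant(zone).toLocalDateTime();
    }

    public static void main(String[] args) {
        // Hypothetical zones: a fixed stand-in for the server's system zone
        // and a different zone set on the session.
        ZoneId systemZone = ZoneId.of("America/Los_Angeles");
        ZoneId sessionZone = ZoneId.of("Asia/Tokyo");

        LocalDateTime value = LocalDateTime.of(2019, 1, 11, 12, 0);
        LocalDateTime stored = toUtc(value, sessionZone);

        // Reading with the zone used for writing round-trips the value.
        System.out.println(fromUtc(stored, sessionZone)); // 2019-01-11T12:00
        // Reading with a different zone shifts it.
        System.out.println(fromUtc(stored, systemZone));  // 2019-01-10T19:00
    }
}
```

The round trip is only the identity when writer and reader agree on the zone,
so switching the conversion from the system zone to the session zone changes
what older data looks like whenever the two differ.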
Instead of following that path, is it possible to provide an upgrade path from
< 3.x to 3.x where the column type for tables stored using Parquet is altered
when we upgrade ('timestamp' -> 'timestamp with local time zone')? Then the
Parquet writer/reader can choose how to store the 'timestamp with local time zone' type
internally, e.g., if it wants to remain compatible with legacy readers, it
could choose to store it as a timestamp. That would provide consistent
semantics moving forward as well as backwards compatibility, albeit DDL
statements created before version 3.x will need to be modified if instant
semantics are required. Is that reasonable? Is there anything I am missing?
Cc [~zi] [~owen.omalley]
> Reinstate Parquet timestamp conversion between HS2 time zone and UTC
> --------------------------------------------------------------------
>
> Key: HIVE-20980
> URL: https://issues.apache.org/jira/browse/HIVE-20980
> Project: Hive
> Issue Type: Sub-task
> Components: File Formats
> Reporter: Karen Coppage
> Assignee: Karen Coppage
> Priority: Major
> Attachments: HIVE-20980.1.patch, HIVE-20980.2.patch,
> HIVE-20980.2.patch
>
>
> With HIVE-20007, Parquet timestamps became timezone-agnostic. This means that
> timestamps written after the change are read exactly as they were written;
> but timestamps stored before this change are effectively converted from the
> writing HS2 server time zone to GMT time zone. This patch reinstates the
> original behavior: timestamps are converted to UTC before write and from UTC
> before read.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)