[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425899#comment-16425899
 ] 

Haozhun Jin commented on HIVE-12192:
------------------------------------

[~jcamachorodriguez], I work with [~findepi]. We are trying to understand the 
Hive project's current status and vision for its timestamp types. Thank you for 
helping us.

Below I summarize my understanding after reading HIVE-12192 and HIVE-16614. The 
table describes what each type means using the semantically equivalent type in 
java.time. (Instant here is a bit more general in that you are allowed to get 
year/month/day/hour/minute fields from it. But this doesn't change its 
fundamental meaning.)
||Hive Type||Legacy Hive||Before HIVE-16614||After HIVE-16614||After HIVE-12192||Eventually||
|Timestamp|Instant|Instant|Instant|LocalDateTime|LocalDateTime|
|Timestamp w local tz|(not present)|(not present)|Instant|Instant|Instant|
|Timestamp w tz|(not present)|Instant|(not present)|(not present)|ZonedDateTime|
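
To make the distinction between the java.time types in the table concrete, here is a minimal sketch. It only illustrates java.time semantics and is not Hive code; the class and method names (TimestampSemantics, hourIn, resolve) are made up for this example.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical illustration class; none of these names come from Hive.
class TimestampSemantics {

    // Instant: a point on the global timeline. Its wall-clock fields
    // depend on the zone through which it is viewed.
    static int hourIn(Instant instant, String zone) {
        return instant.atZone(ZoneId.of(zone)).getHour();
    }

    // LocalDateTime: wall-clock fields only. It denotes a point on the
    // timeline only once a zone is supplied (which yields a ZonedDateTime).
    static Instant resolve(LocalDateTime local, String zone) {
        return ZonedDateTime.of(local, ZoneId.of(zone)).toInstant();
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2018-04-04T17:00:00Z");
        // Same instant, different wall-clock hour per zone.
        System.out.println(hourIn(instant, "America/Los_Angeles")); // 10
        System.out.println(hourIn(instant, "Asia/Tokyo"));          // 2

        LocalDateTime local = LocalDateTime.of(2018, 4, 4, 10, 0);
        // Same wall-clock fields, different instants per zone.
        System.out.println(resolve(local, "America/Los_Angeles").equals(instant)); // true
        System.out.println(resolve(local, "Asia/Tokyo").equals(instant));          // false
    }
}
```

In these terms, a column with Instant semantics shows different field values depending on the session zone, while a column with LocalDateTime semantics shows the same field values everywhere.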

Is this understanding correct?

If my understanding is correct, what is the difference between "Timestamp" and 
"Timestamp w local tz" before HIVE-12192 (except maybe the constructor)?

> Hive should carry out timestamp computations in UTC
> ---------------------------------------------------
>
>                 Key: HIVE-12192
>                 URL: https://issues.apache.org/jira/browse/HIVE-12192
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Hive
>            Reporter: Ryan Blue
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>              Labels: timestamp
>         Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a 
> java.sql.Timestamp avoids this bug, while still returning correct values for 
> {{getYear}} etc. Using UTC as the convenience representation (timestamp 
> without time zone has no real zone) would make timestamp calculations more 
> consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the 
> result is with respect to ["the default timezone and default 
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
>  That function would need to be updated to use the 
> {{System.getProperty("user.timezone")}} zone.
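
The DST gap in the quoted description can be reproduced with java.time. This is a sketch under assumptions: Hive at the time used java.sql.Timestamp rather than java.time, so the code below is only an analogue of the reported behavior, and the names DstGapDemo and roundTrip are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical illustration class; not Hive code.
class DstGapDemo {

    // Interpret wall-clock fields in a zone, then read the fields back.
    // For a time that falls in a DST gap, ZonedDateTime.of shifts it
    // later by the length of the gap.
    static LocalDateTime roundTrip(LocalDateTime wallClock, ZoneId zone) {
        return ZonedDateTime.of(wallClock, zone).toLocalDateTime();
    }

    public static void main(String[] args) {
        LocalDateTime literal = LocalDateTime.parse("2015-03-08T02:10:00.101");

        // 02:10 does not exist in America/Los_Angeles on 2015-03-08.
        System.out.println(roundTrip(literal, ZoneId.of("America/Los_Angeles")));
        // prints 2015-03-08T03:10:00.101

        // UTC has no DST transitions, so the fields survive unchanged.
        System.out.println(roundTrip(literal, ZoneOffset.UTC));
        // prints 2015-03-08T02:10:00.101
    }
}
```

This is why using UTC as the underlying zone lets every wall-clock value be represented.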



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
