[
https://issues.apache.org/jira/browse/HIVE-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545184#comment-13545184
]
Ryan Harris commented on HIVE-3822:
-----------------------------------
I don't have enough experience with this operation on other database systems to
have clear expectations of how things should work in Hive.
I initially noticed the inconsistency when using to_utc_timestamp() and then
identified that the behavior seemed to come from the cast() conversion
If you are saying that the behavior of cast() is expected and appropriate, then
it would seem that to_utc_timestamp() is producing unexpected results....
However, to_utc_timestamp(timestamp,string) expects a timestamp (not
integer/float) value...thus cast() is being used to get the int/float
submitted, INTO a timestamp, which is why
select from_utc_timestamp(to_utc_timestamp(0.0,'GMT'),'GMT') from test limit 1
and
select from_utc_timestamp(to_utc_timestamp(cast(0.0 as timestamp),'GMT'),'GMT')
from test limit 1
both return the same (wrong) result:
1969-12-31 17:00:00
Here is my question....
If you are processing a log file that contains a GMT epoch formatted date
value, what is the appropriate way to process/store that value in a timestamp
field and get the same stored timestamp regardless of the timezone setting of
the cluster processing the file?
What should be changed to ensure that:
select from_utc_timestamp(to_utc_timestamp(0.0,'GMT'),'GMT') from test limit 1
ALWAYS returns 1970-01-01 00:00:00 as it should, no matter what the system
timezone is set for?
> Casting from double to timestamp type seems inconsistent
> --------------------------------------------------------
>
> Key: HIVE-3822
> URL: https://issues.apache.org/jira/browse/HIVE-3822
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.9.0
> Reporter: Mark Grover
> Assignee: Mark Grover
>
> {code}
> select cast(1355944339 as timestamp) from decimal_3 limit 1;
> 1970-01-16 08:39:04.339
> select cast(1355944339000 as timestamp) from decimal_3 limit 1;
> 2012-12-19 11:12:19
> select cast(1355944339.123456789 as timestamp) from decimal_3 limit 1;
> 2012-12-19 11:12:19.1234567
> {code}
> If specifying the unixTimestamp without a decimal point, we need to specify
> the millisecond timestamp. If specifying with a decimal point, we need to
> specify only the second timestamp and the rest goes after decimal. Moreover,
> it seems like some precision is lost (notice '89' are lost in the last query.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira