[
https://issues.apache.org/jira/browse/HIVE-27739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804321#comment-17804321
]
Denys Kuzmenko commented on HIVE-27739:
---------------------------------------
[~zratkai], is that a regression (present in 3.x)?
> Multiple issues with timestamps with timezone - can lead to data inconsistency
> ------------------------------------------------------------------------------
>
> Key: HIVE-27739
> URL: https://issues.apache.org/jira/browse/HIVE-27739
> Project: Hive
> Issue Type: Bug
> Affects Versions: 4.0.0-beta-1
> Reporter: Janos Kovacs
> Assignee: Zoltán Rátkai
> Priority: Major
>
> The following issues were found testing timestamps with timezonse:
> * CREATE TABLE fails with SemanticException when hive.local.time.zone is set
> to different valid value in the session
> * Invalid timezone values (e.g. with typo) treated as UTC which can lead to
> data consistency / loss issues
> * LOCAL is an invalid timezone value and treated as UTC instead of treating
> as system's timezone
> The issues are tracked as sub-tasks.
> in general, base tests are:
> {noformat}
> SELECT
> '\${system:user.timezone}' as os,
> '\${hiveconf:hive.local.time.zone}' as hive,
> 'TZ' as branch,
> tz as orig,
> to_utc_timestamp(tz,'\${hiveconf:hive.local.time.zone}') as to_utc,
>
> from_utc_timestamp(to_utc_timestamp(tz,'\${hiveconf:hive.local.time.zone}'),'Europe/Budapest')
> as to_bp,
>
> from_utc_timestamp(to_utc_timestamp(tz,'\${hiveconf:hive.local.time.zone}'),'America/Los_Angeles')
> as to_la
> FROM timestamptest;
> "
> {noformat}
>
> The results are:
> {noformat}
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | os | hive | branch | orig
> | to_utc | to_bp | to_la
> |
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | Europe/Budapest | Europe/Budapest | TZ | 2016-01-03 21:26:34.0
> Europe/Budapest | 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 |
> 2016-01-03 12:26:34.0 |
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | Europe/Budapest | UTC | TZ | 2016-01-03 20:26:34.0 UTC
> | 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 | 2016-01-03
> 12:26:34.0 |
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | Europe/Budapest | LOCAL | TZ | 2016-01-03 21:26:34.0
> Europe/Budapest | 2016-01-03 21:26:34.0 | 2016-01-03 22:26:34.0 |
> 2016-01-03 13:26:34.0 | !!!
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | UTC | Europe/Budapest | TZ | 2016-01-03 21:26:34.0
> Europe/Budapest | 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 |
> 2016-01-03 12:26:34.0 |
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | UTC | UTC | TZ | 2016-01-03 20:26:34.0 UTC
> | 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 | 2016-01-03
> 12:26:34.0 |
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> | UTC | LOCAL | TZ | 2016-01-03 20:26:34.0 UTC
> | 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 | 2016-01-03
> 12:26:34.0 | !!!
> +------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
> {noformat}
> The problematic cases:
> * the "Europe/Budapest | LOCAL" case is wrong, LOCAL is treated as UTC
> instead of system's TZ which makes 1h offset when converted
> * the "UTC | LOCAL" case is only good because LOCAL is treated as UTC all
> the time
> Repro code and more details are in each of the subtask tickets
--
This message was sent by Atlassian Jira
(v8.20.10#820010)