Janos Kovacs created HIVE-27739:
-----------------------------------
Summary: Multiple issues with timestamps with timezone
Key: HIVE-27739
URL: https://issues.apache.org/jira/browse/HIVE-27739
Project: Hive
Issue Type: Bug
Affects Versions: 4.0.0-beta-1
Reporter: Janos Kovacs
Assignee: Zoltán Rátkai
The following issues were found testing timestamps with timezonse:
* CREATE TABLE fails with SemanticException when hive.local.time.zone is set
to different valid value in the session
* Invalid timezone values (e.g. with typo) treated as UTC which can lead to
data consistency / loss issues
* LOCAL is an invalid timezone value and treated as UTC instead of treating as
system's timezone
The issues are tracked as sub-tasks.
in general, base tests are:
{noformat}
SELECT
'\${system:user.timezone}' as os,
'\${hiveconf:hive.local.time.zone}' as hive,
'TZ' as branch,
tz as orig,
to_utc_timestamp(tz,'\${hiveconf:hive.local.time.zone}') as to_utc,
from_utc_timestamp(to_utc_timestamp(tz,'\${hiveconf:hive.local.time.zone}'),'Europe/Budapest')
as to_bp,
from_utc_timestamp(to_utc_timestamp(tz,'\${hiveconf:hive.local.time.zone}'),'America/Los_Angeles')
as to_la
FROM timestamptest;
"
{noformat}
The results are:
{noformat}
+------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
| os | hive | branch | orig
| to_utc | to_bp | to_la
|
+------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
| Europe/Budapest | Europe/Budapest | TZ | 2016-01-03 21:26:34.0
Europe/Budapest | 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 | 2016-01-03
12:26:34.0 |
+------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
| Europe/Budapest | UTC | TZ | 2016-01-03 20:26:34.0 UTC
| 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 | 2016-01-03
12:26:34.0 |
+------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
| Europe/Budapest | LOCAL | TZ | 2016-01-03 21:26:34.0
Europe/Budapest | 2016-01-03 21:26:34.0 | 2016-01-03 22:26:34.0 | 2016-01-03
13:26:34.0 | !!!
+------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
| UTC | Europe/Budapest | TZ | 2016-01-03 21:26:34.0
Europe/Budapest | 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 | 2016-01-03
12:26:34.0 |
+------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
| UTC | UTC | TZ | 2016-01-03 20:26:34.0 UTC
| 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 | 2016-01-03
12:26:34.0 |
+------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
| UTC | LOCAL | TZ | 2016-01-03 20:26:34.0 UTC
| 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 | 2016-01-03
12:26:34.0 | !!!
+------------------+------------------+---------+----------------------------------------+------------------------+------------------------+------------------------+
{noformat}
The problematic cases:
* the "Europe/Budapest | LOCAL" case is wrong, LOCAL is treated as UTC instead
of system's TZ which makes 1h offset when converted
* the "UTC | LOCAL" case is only good because LOCAL is treated as UTC all the
time
Repro code and more details are in each of the subtask tickets
--
This message was sent by Atlassian Jira
(v8.20.10#820010)