Janos Kovacs created HIVE-27742:
-----------------------------------
Summary: LOCAL timezone value is treated as UTC instead of
system's timezone which causes data consistency issues
Key: HIVE-27742
URL: https://issues.apache.org/jira/browse/HIVE-27742
Project: Hive
Issue Type: Sub-task
Affects Versions: 4.0.0-beta-1
Reporter: Janos Kovacs
Assignee: Zoltán Rátkai
The Hive configuration states:
{noformat}
HIVE_LOCAL_TIME_ZONE("hive.local.time.zone", "LOCAL",
"Sets the time-zone for displaying and interpreting time stamps. If
this property value is set to\n" +
"LOCAL, it is not specified, or it is not a correct time-zone, the
system default time-zone will be\n " +
"used instead. Time-zone IDs can be specified as region-based zone IDs
(based on IANA time-zone data),\n" +
"abbreviated zone IDs, or offset IDs."),
{noformat}
But seems like in hive4 (-beta) it always treated as UTC - as any other invalid
timezone value (see HIVE-27741).
Repro code:
{noformat}
docker rm -f hive4
export HIVE_VERSION=4.0.0-beta-2-SNAPSHOT
export HS2_ENV_TZ="Europe/Budapest"
export HS2_USER_TZ=${HS2_ENV_TZ}
export HIVE_LOCAL_TZ=${HS2_ENV_TZ}
export HS2_OPTS="-Duser.timezone=$HS2_USER_TZ
-Dhive.local.time.zone=$HIVE_LOCAL_TZ"
export HS2_OPTS="$HS2_OPTS
-Dhive.server2.tez.initialize.default.sessions=false"
docker run -d -p 10000:10000 -p 10001:10001 -p 10002:10002 --env
TZ=${HS2_ENV_TZ} --env SERVICE_OPTS=${HS2_OPTS} --env SERVICE_NAME=hiveserver2
--name hive4 apache/hive:${HIVE_VERSION}
docker exec -it hive4 beeline -u 'jdbc:hive2://localhost:10000/' -e "
SELECT '\${env:TZ}' as \`env:TZ\`,
'\${system:user.timezone}' as \`system:user.timezone\`,
'\${hiveconf:hive.local.time.zone}' as \`hiveconf:hive.local.time.zone\`;
DROP TABLE IF EXISTS timestamptest;
CREATE TABLE timestamptest (
ts timestamp,
tz timestamp with local time zone
) STORED AS TEXTFILE;
INSERT INTO timestamptest select TIMESTAMP'2016-01-03
12:26:34',TIMESTAMPLOCALTZ'2016-01-03 12:26:34 America/Los_Angeles';
SET hive.query.results.cache.enabled=false;
SET hive.local.time.zone=LOCAL;
SELECT '\${env:TZ}' as \`env:TZ\`,
'\${system:user.timezone}' as \`system:user.timezone\`,
'\${hiveconf:hive.local.time.zone}' as \`hiveconf:hive.local.time.zone\`;
SELECT
'LOCAL' as tzset,
tz as orig,
to_utc_timestamp(tz, 'LOCAL') as
utc_local,
to_utc_timestamp(tz, 'Europe/Budapest') as utc_tz,
from_utc_timestamp(to_utc_timestamp(tz,'LOCAL'),'Europe/Budapest') as to_bp
FROM timestamptest;
SET hive.local.time.zone=Europe/Budapest;
SELECT '\${env:TZ}' as \`env:TZ\`,
'\${system:user.timezone}' as \`system:user.timezone\`,
'\${hiveconf:hive.local.time.zone}' as \`hiveconf:hive.local.time.zone\`;
SELECT
'Europe/Budapest'
as tzset,
tz
as orig,
to_utc_timestamp(tz, 'LOCAL')
as utc_local,
to_utc_timestamp(tz, 'Europe/Budapest')
as utc_tz,
from_utc_timestamp(to_utc_timestamp(tz,'Europe/Budapest'),'Europe/Budapest')
as to_bp
FROM timestamptest;
"
{noformat}
The results are:
{noformat}
+------------------+-----------------------+--------------------------------+
| env:tz | system:user.timezone | hiveconf:hive.local.time.zone |
+------------------+-----------------------+--------------------------------+
| Europe/Budapest | Europe/Budapest | LOCAL |
+------------------+-----------------------+--------------------------------+
+--------+----------------------------------------+------------------------+------------------------+------------------------+
| tzset | orig | utc_local |
utc_tz | to_bp |
+--------+----------------------------------------+------------------------+------------------------+------------------------+
| LOCAL | 2016-01-03 21:26:34.0 Europe/Budapest | 2016-01-03 21:26:34.0 |
2016-01-03 20:26:34.0 | 2016-01-03 22:26:34.0 |
+--------+----------------------------------------+------------------------+------------------------+------------------------+
+------------------+-----------------------+--------------------------------+
| env:tz | system:user.timezone | hiveconf:hive.local.time.zone |
+------------------+-----------------------+--------------------------------+
| Europe/Budapest | Europe/Budapest | Europe/Budapest |
+------------------+-----------------------+--------------------------------+
+------------------+----------------------------------------+------------------------+------------------------+------------------------+
| tzset | orig | utc_local
| utc_tz | to_bp |
+------------------+----------------------------------------+------------------------+------------------------+------------------------+
| Europe/Budapest | 2016-01-03 21:26:34.0 Europe/Budapest | 2016-01-03
21:26:34.0 | 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 |
+------------------+----------------------------------------+------------------------+------------------------+------------------------+
{noformat}
The *to_bp* values show a 1h offset because LOCAL treated as UTC instead of
system's timezone and converting from it offsets the calculated value.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)