[
https://issues.apache.org/jira/browse/IMPALA-10831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jean-Philippe Gravel updated IMPALA-10831:
------------------------------------------
Description:
I take special care to save all my timestamps as UTC. in the table:
{{ CREATE TABLE filter_pulses (}}
{{ tag STRING,}}
{{ ts TIMESTAMP,}}
{{ val DOUBLE}}
{{ )}}
{{ PARTITIONED BY (}}
{{ ym STRING –- ym stands for year-month}}
{{ )}}
{{ STORED AS PARQUET;}}
When I do the following query:
{{ SELECT max(ts) FROM filter_pulses;}}
{{I' expect to recieve '2021-07-23 23:59:59' but get this result (I'm in
America/Montreal, UTC-4:00):}}
1 2021-07-{color:#de350b}23 {color}{color:#de350b}03:59:59{color}
But when I do that other query:
SELECT tag, max(ts) FROM filter_pulses GROUP BY tag ORDER BY tag
I have all my tags max(ts) 23:59:59 as expected:
|4203-PDT-DPB-400-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-410-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-420-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-430-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-440-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-450-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-460-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-470-01:PV|2021-07-22 23:59:59.0|
None of the tags have a maximum timestamp on the 23rd july at 4 am minus one
sec.
I also tested the faulty query on HIVE and the result is the good expected
result of '2021-07-22 23:59:59.0'. {color:#00875a}HIVE is good.{color} IMPALA
interprets the date tz from the parquet data as America/Montreal and convert it
to UTC.
was:
I take special care to save all my timestamps as UTC. in the table:
{{ CREATE TABLE filter_pulses (}}
{{ tag STRING,}}
{{ ts TIMESTAMP,}}
{{ val DOUBLE}}
{{ )}}
{{ PARTITIONED BY (}}
{{ ym STRING –- ym stands for year-month}}
{{ )}}
{{ STORED AS PARQUET;}}
When I do the following query:
{{ SELECT max(ts) FROM filter_pulses;}}
{{I' expect to recieve '2021-07-23 23:59:59' but get this result (I'm in
America/Montreal, UTC-4:00):}}
1 2021-07-23 {color:#de350b}03:59:59{color}
But when I do that other query:
SELECT tag, max(ts) FROM filter_pulses GROUP BY tag ORDER BY tag
I have all my tags max(ts) 23:59:59 as expected:
|4203-PDT-DPB-400-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-410-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-420-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-430-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-440-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-450-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-460-01:PV|2021-07-22 23:59:59.0|
|4203-PDT-DPB-470-01:PV|2021-07-22 23:59:59.0|
None of the tags have a maximum timestamp on the 23rd july at 4 am minus one
sec.
I also tested the faulty query on HIVE and the result is the good expected
result of '2021-07-22 23:59:59.0'. {color:#00875a}HIVE is good.{color} IMPALA
interprets the date tz from the parquet data as America/Montreal and convert it
to UTC.
> Inconsistent time zone conversion to local
> ------------------------------------------
>
> Key: IMPALA-10831
> URL: https://issues.apache.org/jira/browse/IMPALA-10831
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.2.0
> Environment: In IMPALA: SELECT version():
> impalad version 3.2.0-cdh6.2.0 RELEASE (build
> edc19942b4debdbfd485fbd26098eef435003f5d) Built on Thu Mar 14 00:14:36 PDT
> 2019
> -------------------
> SERVER (cat /etc/redhat-release):
> CentOS release 6.10 (Final)
> Reporter: Jean-Philippe Gravel
> Priority: Major
>
> I take special care to save all my timestamps as UTC. in the table:
> {{ CREATE TABLE filter_pulses (}}
> {{ tag STRING,}}
> {{ ts TIMESTAMP,}}
> {{ val DOUBLE}}
> {{ )}}
> {{ PARTITIONED BY (}}
> {{ ym STRING –- ym stands for year-month}}
> {{ )}}
> {{ STORED AS PARQUET;}}
> When I do the following query:
> {{ SELECT max(ts) FROM filter_pulses;}}
> {{I' expect to recieve '2021-07-23 23:59:59' but get this result (I'm in
> America/Montreal, UTC-4:00):}}
> 1 2021-07-{color:#de350b}23 {color}{color:#de350b}03:59:59{color}
> But when I do that other query:
> SELECT tag, max(ts) FROM filter_pulses GROUP BY tag ORDER BY tag
> I have all my tags max(ts) 23:59:59 as expected:
> |4203-PDT-DPB-400-01:PV|2021-07-22 23:59:59.0|
> |4203-PDT-DPB-410-01:PV|2021-07-22 23:59:59.0|
> |4203-PDT-DPB-420-01:PV|2021-07-22 23:59:59.0|
> |4203-PDT-DPB-430-01:PV|2021-07-22 23:59:59.0|
> |4203-PDT-DPB-440-01:PV|2021-07-22 23:59:59.0|
> |4203-PDT-DPB-450-01:PV|2021-07-22 23:59:59.0|
> |4203-PDT-DPB-460-01:PV|2021-07-22 23:59:59.0|
> |4203-PDT-DPB-470-01:PV|2021-07-22 23:59:59.0|
> None of the tags have a maximum timestamp on the 23rd july at 4 am minus one
> sec.
> I also tested the faulty query on HIVE and the result is the good expected
> result of '2021-07-22 23:59:59.0'. {color:#00875a}HIVE is good.{color} IMPALA
> interprets the date tz from the parquet data as America/Montreal and convert
> it to UTC.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]