[ https://issues.apache.org/jira/browse/IMPALA-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Philip Zeyliger updated IMPALA-7730:
------------------------------------
Attachment: orc.zip
> Improve ORC File Format Timezone issues
> ---------------------------------------
>
> Key: IMPALA-7730
> URL: https://issues.apache.org/jira/browse/IMPALA-7730
> Project: IMPALA
> Issue Type: Task
> Components: Backend
> Affects Versions: Impala 3.0
> Reporter: Philip Zeyliger
> Priority: Major
> Attachments: orc.zip
>
>
> As pointed out in https://gerrit.cloudera.org/#/c/11731 by [~csringhofer],
> our support for the ORC file format doesn't follow the same timezone
> conventions as the rest of Impala.
> {quote}
> tldr: ORC's timezone handling is likely to be broken in Impala so we should
> patch it in the toolchain
> The ORC library implements its own IANA timezone handling to convert stored
> timestamps from UTC to local time, and does something similar for min/max
> stats. The writer's timezone can also be stored in .orc files and used
> instead of the local timezone.
> Impala's timezone and the ORC library's timezone can differ for several
> reasons:
> * ORC's timezone is not overridden by the env var TZ or the query option
>   timezone
> * ORC uses a simpler way to detect the local timezone, which may not work on
>   some Linux distros (see TimezoneDatabase::LocalZoneName in Impala vs
>   LOCAL_TIMEZONE in Orc)
> * .orc files can use any time zone as the writer's timezone, and we cannot
>   be sure that it will exist on the reader machine
> My suggestion is to patch the ORC library in the toolchain and remove
> timezone handling (e.g. by always using UTC, maybe depending on a flag), as
> the way it is currently working is likely to be broken and is surely not
> consistent with the rest of Impala.
> I am not sure how timezones could be handled correctly in Orc + Impala. If
> someone plans to work on it, I would gladly help with the integration into
> Impala.
> {quote}
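
To illustrate the mismatch described in the quote, here is a minimal Python
sketch (not Impala or ORC code). It assumes a timestamp stored in UTC and two
components that resolve "local time" differently: one falls back to UTC, the
other honors a UTC+2 setting. The helper name utc_to_wall_clock and the fixed
offsets are hypothetical and chosen only for illustration; real readers
resolve IANA zone names, which also carry DST rules.

```python
from datetime import datetime, timedelta, timezone


def utc_to_wall_clock(utc_instant, utc_offset_hours):
    """Render a UTC instant as a naive wall-clock time at a fixed UTC offset.

    Hypothetical helper: a fixed offset stands in for a resolved time zone.
    """
    zone = timezone(timedelta(hours=utc_offset_hours))
    return utc_instant.astimezone(zone).replace(tzinfo=None)


# A timestamp as stored in the file, normalized to UTC.
stored = datetime(2018, 10, 15, 8, 0, 0, tzinfo=timezone.utc)

# Assumed scenario: the two components disagree about the local zone.
as_seen_by_library = utc_to_wall_clock(stored, 0)  # resolved "local" as UTC
as_seen_by_engine = utc_to_wall_clock(stored, 2)   # honors a UTC+2 setting

print(as_seen_by_library)                      # 2018-10-15 08:00:00
print(as_seen_by_engine)                       # 2018-10-15 10:00:00
print(as_seen_by_engine - as_seen_by_library)  # 2:00:00
```

With an offset of 0 the stored value comes back unchanged on every machine,
which is the effect the "always use UTC" suggestion aims for: the conversion
to local time would then happen in one place only.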