Philip Zeyliger created IMPALA-7730:
---------------------------------------

             Summary: Improve ORC File Format Timezone issues
                 Key: IMPALA-7730
                 URL: https://issues.apache.org/jira/browse/IMPALA-7730
             Project: IMPALA
          Issue Type: Task
          Components: Backend
    Affects Versions: Impala 3.0
            Reporter: Philip Zeyliger


As pointed out in https://gerrit.cloudera.org/#/c/11731 by [~csringhofer], our 
support for the ORC file format doesn't follow the same timezone conventions as 
the rest of Impala.

{quote}
tldr: ORC's timezone handling is likely to be broken in Impala so we should 
patch it in the toolchain

The ORC library implements its own IANA timezone handling to convert stored 
timestamps from UTC to local time, and does something similar for min/max 
stats. The writer's timezone can also be stored in .orc files and used instead 
of the local timezone.

Impala's and the ORC library's timezones can differ for several reasons:

* ORC's timezone is not overridden by the TZ env var or the timezone query 
option.
* ORC uses a simpler way to detect the local timezone, which may not work on 
some Linux distros (see TimezoneDatabase::LocalZoneName in Impala vs 
LOCAL_TIMEZONE in Orc).
* .orc files can use any time zone as the writer's timezone, and we cannot be 
sure that it will exist on the reader machine.

My suggestion is to patch the ORC library in the toolchain and remove timezone 
handling (e.g. by always using UTC, maybe depending on a flag), as the way it 
currently works is likely to be broken and is surely not consistent with the 
rest of Impala.

I am not sure how timezones could be handled correctly in Orc + Impala. If 
someone plans to work on it, I would gladly help with the integration into 
Impala.
{quote}
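
To illustrate the inconsistency described in the quote, here is a minimal 
Python sketch. It is not Impala or ORC code: to_wall_clock and the zone names 
are hypothetical, chosen only for illustration, and falling back to UTC for an 
unknown zone is an assumption of this sketch rather than the behavior of either 
library. It shows that one stored UTC instant reads as different wall-clock 
values depending on which zone the reader resolves, and that a writer zone 
missing from the reader machine needs an explicit policy.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def to_wall_clock(utc_instant: datetime, zone_name: str) -> datetime:
    """Render a UTC instant as naive wall-clock time in zone_name.

    Assumption of this sketch: if the zone is not available on this
    machine, fall back to UTC instead of failing.
    """
    try:
        zone = ZoneInfo(zone_name)
    except (ZoneInfoNotFoundError, ValueError):
        zone = timezone.utc
    return utc_instant.astimezone(zone).replace(tzinfo=None)


# One timestamp as it would be stored in a file: an instant in UTC.
stored = datetime(2018, 10, 19, 12, 0, tzinfo=timezone.utc)

# Two components that resolve the "local" zone differently disagree
# about the wall-clock value of the same stored instant.
zone_seen_by_library = to_wall_clock(stored, "America/Los_Angeles")
zone_seen_by_engine = to_wall_clock(stored, "Europe/Budapest")

# A writer zone that does not exist on the reader machine
# (hypothetical name) takes the UTC fallback path.
unknown_writer_zone = to_wall_clock(stored, "Mars/Olympus_Mons")

print(zone_seen_by_library)
print(zone_seen_by_engine)
print(unknown_writer_zone)
```

Always converting with UTC, as the quote suggests, removes the dependence on 
which zone each component happens to detect.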




