[ https://issues.apache.org/jira/browse/IMPALA-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Philip Zeyliger updated IMPALA-7730:
------------------------------------
    Attachment: orc.zip

> Improve ORC File Format Timezone issues
> ---------------------------------------
>
>                 Key: IMPALA-7730
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7730
>             Project: IMPALA
>          Issue Type: Task
>          Components: Backend
>    Affects Versions: Impala 3.0
>            Reporter: Philip Zeyliger
>            Priority: Major
>         Attachments: orc.zip
>
>
> As pointed out in https://gerrit.cloudera.org/#/c/11731 by [~csringhofer],
> our support for the ORC file format doesn't follow the same timezone
> conventions as the rest of Impala.
> {quote}
> tldr: ORC's timezone handling is likely to be broken in Impala so we should
> patch it in the toolchain
>
> The ORC library implements its own IANA timezone handling to convert stored
> timestamps from UTC to local time + do something similar for min/max stats.
> The writer's timezone can be also stored in .orc files and used instead of
> local timezone.
>
> Impala's and ORC library's timezone can be different because of several
> reasons:
> * ORC's timezone is not overridden by env var TZ and query option timezone
> * ORC uses a simpler way to detect the local timezone which may not work on
> some Linux distros (see TimezoneDatabase::LocalZoneName in Impala vs
> LOCAL_TIMEZONE in Orc)
> * .orc files can use any time zone as writer's timezone and we cannot be sure
> that it will exist on the reader machine
>
> My suggestion is to patch the ORC library in the toolchain and remove
> timezone handling (e.g. by always using UTC, maybe depending on a flag), as
> the way it is currently working is likely to be broken and is surely not
> consistent with the rest of Impala.
>
> I am not sure how timezones could be handled correctly in Orc + Impala. If
> someone plans to work on it, I would gladly help in the integration to Impala.
> {quote}
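
To illustrate the inconsistency described in the quoted comment, here is a minimal Python sketch. It is not Impala or ORC code. It assumes fixed UTC offsets as stand-ins for real IANA zones, and the names `render`, `ORC_LOCAL`, and `IMPALA_SESSION` are hypothetical. It shows how one stored UTC instant yields different wall-clock values when two components disagree on the local timezone, and how always using UTC (the suggestion above) removes the disagreement.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-ins: fixed offsets instead of real IANA zones.
ORC_LOCAL = timezone(timedelta(hours=-8))      # zone a library might detect on its own
IMPALA_SESSION = timezone(timedelta(hours=1))  # zone chosen via TZ or a query option


def render(stored_utc: datetime, zone: timezone) -> str:
    """Convert a stored UTC instant to wall-clock time in `zone`."""
    return stored_utc.astimezone(zone).strftime("%Y-%m-%d %H:%M:%S")


# One timestamp as it would be stored in the file (UTC).
stored = datetime(2018, 10, 18, 12, 0, 0, tzinfo=timezone.utc)

by_library = render(stored, ORC_LOCAL)       # what the file reader would return
by_session = render(stored, IMPALA_SESSION)  # what the rest of the engine expects
as_utc = render(stored, timezone.utc)        # result if conversion is disabled

print(by_library)  # 2018-10-18 04:00:00
print(by_session)  # 2018-10-18 13:00:00
print(as_utc)      # 2018-10-18 12:00:00
```

The first two values differ by nine hours for the same stored instant, which is the kind of mismatch the comment warns about. Rendering in UTC gives a single value regardless of either component's local zone.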