What you describe (storing in UTC and adjusting to local time) is the implicit timezone that is associated with the plain TIMEZONE type of ANSI SQL. Excerpts:
Datetime data types that contain time fields (TIME and TIMESTAMP) are maintained in Universal Coordinated Time (UTC), with an explicit or implied time zone part. A TIME or TIMESTAMP that does not specify WITH TIME ZONE has an im- plicit time zone equal to the local time zone for the SQL-session. The value of time represented in the data changes along with the local time zone for the SQL-session. However, the meaning of the time does not change because it is effectively maintained in UTC. Zoltan On Mon, Feb 27, 2017 at 7:34 PM Marcel Kornacker <[email protected]> wrote: > On Mon, Feb 27, 2017 at 8:47 AM, Zoltan Ivanfi <[email protected]> wrote: > > Hi, > > > > Although the draft of SQL-92[1] does not explicitly state that the time > zone > > offset has to be stored, the following excerpts strongly suggest that the > > time zone has to be stored with each individual value of TIMESTAMP WITH > TIME > > ZONE: > > > > The length of a TIMESTAMP is 19 positions [...] > > The length of a TIMESTAMP WITH TIME ZONE is 25 positions [...] > > > > The draft of SQL 2003[2] standard is more specific: > > > > TIMESTAMP and TIME may also be specified as being WITH TIME ZONE, in > which > > case every value has associated with it a time zone displacement. > > It is not clear to me whether that means that the timezone is *stored* > with the value. In Postgres, a timestamp still has an associated > timezone on input and output (but it only stores the timezone > normalized to UTC). > > > > > However, the TIMESTAMP WITH TIME ZONE type in PostgreSQL does not > conform to > > this definition, but behaves like the plain TIMESTAMP type of the SQL > > specification. Oracle calls this type TIMESTAMP WITH LOCAL TIME instead. > > > > It's easier to get an overview on this chaos if we list the different > > interpretations and data types: > > > > Timestamp stored with a time zone: > > - ANSI SQL type: TIMESTAMP WITH TIME ZONE > > - Oracle type: TIMESTAMP WITH TIME ZONE > > - PostgeSQL type: - > > > > Timestamp stored without a time zone that represents a time in UTC > > (automatically converted to/from local time): > > - ANSI SQL type: TIMESTAMP > > - Oracle type: TIMESTAMP WITH LOCAL TIME > > - PostgeSQL type: TIMESTAMP WITH TIME ZONE > > - Parquet type: TIMESTAMP_MILLIS > > > > Timestamp stored without a time zone that represents floating time (has > the > > same apparent value regardless of time zone, does not refer to an > absolute > > time instant): > > - ANSI SQL type: - > > - Oracle type: TIMESTAMP > > - PostgeSQL type: TIMESTAMP > > - Impala type: TIMESTAMP (stored as INT96 in Parquet) > > > > [1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt > > [2] http://www.wiscorp.com/sql_2003_standard.zip > > > > Zoltan > > > > > > > > On Thu, Feb 23, 2017 at 10:46 PM Marcel Kornacker <[email protected]> > wrote: > >> > >> Regarding timestamp with timezone: I'm not sure whether the SQL > >> standard requires the timezone to be stored along with the timestamp > >> for 'timestamp with timezone' (at least Oracle and Postgres diverge on > >> that topic). > >> > >> Cc'ing Greg Rahn to shed some more light on that. > >> > >> Regarding 'make Impala depend on parquet-cpp': could someone expand on > >> why we want to do this? There probably is overlap between > >> Impala/Kudu/parquet-cpp, but the runtime systems of the first two have > >> specific requirements (and are also different from each other), so > >> trying to unify this into parquet-cpp seems difficult. > >> > >> On Thu, Feb 23, 2017 at 11:22 AM, Julien Le Dem <[email protected]> > wrote: > >> > Attendees/agenda: > >> > - Nandor, Zoltan (Cloudera/file formats) > >> > - Lars (Cloudera/Impala)" Statistics progress > >> > - Uwe (Blue Yonder): Parquet cpp RC. Int96 timestamps > >> > - Wes (twosigma): parquet cpp rc. 1.0 Release > >> > - Julien (Dremio): parquet metadata. Statistics. > >> > - Deepak (HP/Vertica): Parquet-cpp > >> > - Kazuaki: > >> > - Ryan was excused :) > >> > > >> > Note: > >> > - Statistics: https://github.com/apache/parquet-format/pull/46 > >> > - Impala is waiting for parquet-format to settle on the format to > >> > finalize their simple mentation. > >> > - Action: Julien to follow up with Ryan on the PR > >> > > >> > - Int96 timestamps: https://github.com/apache/parquet-format/pull/49 > >> > (needs Ryan's feedback) > >> > - format is nanosecond level timestamp from midnight (64 bits) > >> > followed > >> > by number of days (32 bits) > >> > - it sounds like int96 ordering is different from natural byte > array > >> > ordering because days is last in the bytes > >> > - discussion about swapping bytes: > >> > - format dependent on the boost library used > >> > - there could be performance concerns in Impala against changing > >> > it > >> > - there may be a separate project in impala to swap the bytes > for > >> > kudu compatibility. > >> > - discussion about deprecating int96: > >> > - need to be able to read them always > >> > - not need to define ordering if we have a clear replacement > >> > - Need to clarify the requirement for alternative . > >> > - int64 could be enough it sounds that nanosecond granularity > might > >> > not be needed. > >> > - Julien to create JIRAs: > >> > - int96 ordering > >> > - int96 deprecation, replacement. > >> > > >> > - extra timestamp logical type: > >> > - floating timestamp: (not TZ stored. up to the reader to interpret > TS > >> > based on their TZ) > >> > - this would be better for following sql standard > >> > - Julien to create JIRA > >> > - timestamp with timezone (per SQL): > >> > - each value has timezone > >> > - TZ can be different for each value > >> > - Julien to create JIRA > >> > > >> > - parquet-cpp 1.0 release > >> > - Uwe to update release script in master. > >> > - Uwe to launch a new vote with new RC > >> > > >> > - make impala depend on parquet-cpp > >> > - duplication between parquet/impala/kudu > >> > - need to measure level of overlap > >> > - Wes to open JIRA for this > >> > - also need an "apache commons for c++” for SQL type operations: > >> > -> could be in arrow > >> > > >> > - metadata improvements. > >> > - add page level metadata in footer > >> > - page skipping. > >> > - Julien to open JIRA. > >> > > >> > - add version of the writer in the footer (more precise than > current). > >> > - Zoltan to open Jira > >> > - possibly add bitfield for bug fixes. > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > On Thu, Feb 23, 2017 at 10:01 AM, Julien Le Dem <[email protected]> > >> > wrote: > >> > > >> >> https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up > >> >> > >> >> -- > >> >> Julien > >> >> > >> > > >> > > >> > > >> > -- > >> > Julien >
