On Mon, Feb 27, 2017 at 10:43 AM, Zoltan Ivanfi <[email protected]> wrote: > What you describe (storing in UTC and adjusting to local time) is the > implicit timezone that is associated with the plain TIMEZONE type of ANSI > SQL. Excerpts:
Postgres allows explicit timezone offsets in timestamp literals. When these are present they override the implicit local timezone. > > Datetime data types that contain time fields (TIME and TIMESTAMP) are > maintained > in Universal Coordinated Time (UTC), with an explicit or implied time zone > part. > > A TIME or TIMESTAMP that does not specify WITH TIME ZONE has an im- > plicit time zone equal to the local time zone for the SQL-session. > The value of time represented in the data changes along with the > local time zone for the SQL-session. However, the meaning of the > time does not change because it is effectively maintained in UTC. > > Zoltan > > On Mon, Feb 27, 2017 at 7:34 PM Marcel Kornacker <[email protected]> wrote: >> >> On Mon, Feb 27, 2017 at 8:47 AM, Zoltan Ivanfi <[email protected]> wrote: >> > Hi, >> > >> > Although the draft of SQL-92[1] does not explicitly state that the time >> > zone >> > offset has to be stored, the following excerpts strongly suggest that >> > the >> > time zone has to be stored with each individual value of TIMESTAMP WITH >> > TIME >> > ZONE: >> > >> > The length of a TIMESTAMP is 19 positions [...] >> > The length of a TIMESTAMP WITH TIME ZONE is 25 positions [...] >> > >> > The draft of SQL 2003[2] standard is more specific: >> > >> > TIMESTAMP and TIME may also be specified as being WITH TIME ZONE, in >> > which >> > case every value has associated with it a time zone displacement. >> >> It is not clear to me whether that means that the timezone is *stored* >> with the value. In Postgres, a timestamp still has an associated >> timezone on input and output (but it only stores the timezone >> normalized to UTC). >> >> > >> > However, the TIMESTAMP WITH TIME ZONE type in PostgreSQL does not >> > conform to >> > this definition, but behaves like the plain TIMESTAMP type of the SQL >> > specification. Oracle calls this type TIMESTAMP WITH LOCAL TIME instead. >> > >> > It's easier to get an overview on this chaos if we list the different >> > interpretations and data types: >> > >> > Timestamp stored with a time zone: >> > - ANSI SQL type: TIMESTAMP WITH TIME ZONE >> > - Oracle type: TIMESTAMP WITH TIME ZONE >> > - PostgeSQL type: - >> > >> > Timestamp stored without a time zone that represents a time in UTC >> > (automatically converted to/from local time): >> > - ANSI SQL type: TIMESTAMP >> > - Oracle type: TIMESTAMP WITH LOCAL TIME >> > - PostgeSQL type: TIMESTAMP WITH TIME ZONE >> > - Parquet type: TIMESTAMP_MILLIS >> > >> > Timestamp stored without a time zone that represents floating time (has >> > the >> > same apparent value regardless of time zone, does not refer to an >> > absolute >> > time instant): >> > - ANSI SQL type: - >> > - Oracle type: TIMESTAMP >> > - PostgeSQL type: TIMESTAMP >> > - Impala type: TIMESTAMP (stored as INT96 in Parquet) >> > >> > [1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt >> > [2] http://www.wiscorp.com/sql_2003_standard.zip >> > >> > Zoltan >> > >> > >> > >> > On Thu, Feb 23, 2017 at 10:46 PM Marcel Kornacker <[email protected]> >> > wrote: >> >> >> >> Regarding timestamp with timezone: I'm not sure whether the SQL >> >> standard requires the timezone to be stored along with the timestamp >> >> for 'timestamp with timezone' (at least Oracle and Postgres diverge on >> >> that topic). >> >> >> >> Cc'ing Greg Rahn to shed some more light on that. >> >> >> >> Regarding 'make Impala depend on parquet-cpp': could someone expand on >> >> why we want to do this? There probably is overlap between >> >> Impala/Kudu/parquet-cpp, but the runtime systems of the first two have >> >> specific requirements (and are also different from each other), so >> >> trying to unify this into parquet-cpp seems difficult. >> >> >> >> On Thu, Feb 23, 2017 at 11:22 AM, Julien Le Dem <[email protected]> >> >> wrote: >> >> > Attendees/agenda: >> >> > - Nandor, Zoltan (Cloudera/file formats) >> >> > - Lars (Cloudera/Impala)" Statistics progress >> >> > - Uwe (Blue Yonder): Parquet cpp RC. Int96 timestamps >> >> > - Wes (twosigma): parquet cpp rc. 1.0 Release >> >> > - Julien (Dremio): parquet metadata. Statistics. >> >> > - Deepak (HP/Vertica): Parquet-cpp >> >> > - Kazuaki: >> >> > - Ryan was excused :) >> >> > >> >> > Note: >> >> > - Statistics: https://github.com/apache/parquet-format/pull/46 >> >> > - Impala is waiting for parquet-format to settle on the format to >> >> > finalize their simple mentation. >> >> > - Action: Julien to follow up with Ryan on the PR >> >> > >> >> > - Int96 timestamps: https://github.com/apache/parquet-format/pull/49 >> >> > (needs Ryan's feedback) >> >> > - format is nanosecond level timestamp from midnight (64 bits) >> >> > followed >> >> > by number of days (32 bits) >> >> > - it sounds like int96 ordering is different from natural byte >> >> > array >> >> > ordering because days is last in the bytes >> >> > - discussion about swapping bytes: >> >> > - format dependent on the boost library used >> >> > - there could be performance concerns in Impala against >> >> > changing >> >> > it >> >> > - there may be a separate project in impala to swap the bytes >> >> > for >> >> > kudu compatibility. >> >> > - discussion about deprecating int96: >> >> > - need to be able to read them always >> >> > - not need to define ordering if we have a clear replacement >> >> > - Need to clarify the requirement for alternative . >> >> > - int64 could be enough it sounds that nanosecond granularity >> >> > might >> >> > not be needed. >> >> > - Julien to create JIRAs: >> >> > - int96 ordering >> >> > - int96 deprecation, replacement. >> >> > >> >> > - extra timestamp logical type: >> >> > - floating timestamp: (not TZ stored. up to the reader to interpret >> >> > TS >> >> > based on their TZ) >> >> > - this would be better for following sql standard >> >> > - Julien to create JIRA >> >> > - timestamp with timezone (per SQL): >> >> > - each value has timezone >> >> > - TZ can be different for each value >> >> > - Julien to create JIRA >> >> > >> >> > - parquet-cpp 1.0 release >> >> > - Uwe to update release script in master. >> >> > - Uwe to launch a new vote with new RC >> >> > >> >> > - make impala depend on parquet-cpp >> >> > - duplication between parquet/impala/kudu >> >> > - need to measure level of overlap >> >> > - Wes to open JIRA for this >> >> > - also need an "apache commons for c++” for SQL type operations: >> >> > -> could be in arrow >> >> > >> >> > - metadata improvements. >> >> > - add page level metadata in footer >> >> > - page skipping. >> >> > - Julien to open JIRA. >> >> > >> >> > - add version of the writer in the footer (more precise than >> >> > current). >> >> > - Zoltan to open Jira >> >> > - possibly add bitfield for bug fixes. >> >> > >> >> > >> >> > >> >> > >> >> > >> >> > >> >> > >> >> > >> >> > >> >> > On Thu, Feb 23, 2017 at 10:01 AM, Julien Le Dem <[email protected]> >> >> > wrote: >> >> > >> >> >> https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up >> >> >> >> >> >> -- >> >> >> Julien >> >> >> >> >> > >> >> > >> >> > >> >> > -- >> >> > Julien
