Hi,

Please see my comments inline.
On Tue, Feb 28, 2017 at 3:00 AM, Greg Rahn <[email protected]> wrote:

> I think the decision comes down to how many TIMESTAMP types Parquet
> (and systems that use it as a format) want to support, or the use cases
> that are being targeted.

Please note that there already is a timestamp type specified by Parquet:
TIMESTAMP_MILLIS. It does not contain an explicit timezone, but the
specification ties it to a UTC instant, so it works differently than
Impala's "floating" timestamp type stored in an INT96.

> If the answer is two, then it makes sense to follow the ANSI standard and
> what Postgres et al. have done:
> - timestamp [ without time zone ] - never adjust for TZ, treated like a
>   timestamp string
> - timestamp with time zone - normalize to UTC based on explicit TZ input
>   or implicit TZ from env, results normalized to local/client TZ

The ANSI SQL-92 timestamp [without time zone] is also stored in UTC and
adjusted to local time, as the following excerpt shows:

    A TIME or TIMESTAMP that does not specify WITH TIME ZONE has an im-
    plicit time zone equal to the local time zone for the SQL-session.
    The value of time represented in the data changes along with the
    local time zone for the SQL-session. However, the meaning of the
    time does not change because it is effectively maintained in UTC.
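To illustrate the difference between the two interpretations, here is a small
Python sketch using only the standard library (the stored value and the zone
choices are mine, picked for illustration):

```python
from datetime import datetime, timedelta, timezone

# A TIMESTAMP_MILLIS value is a count of milliseconds since the Unix
# epoch in UTC, so it pins down one absolute instant.
millis = 1_493_985_600_000  # hypothetical stored value

instant = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=millis)

# Instant semantics: the wall-clock rendering depends on the reader's zone.
utc_view = instant.astimezone(timezone.utc)
cet_view = instant.astimezone(timezone(timedelta(hours=2)))

assert utc_view == cet_view            # compared as instants they are equal,
assert utc_view.hour != cet_view.hour  # but the rendered hours differ

# A "floating" timestamp (Impala's INT96 interpretation) would instead keep
# the same wall-clock fields no matter which time zone the reader is in.
```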
> If the answer is three, then it makes sense to mimic Oracle's timestamp
> types despite the differences from ANSI SQL naming/behavior:
> - timestamp [ without time zone ] - never adjust for TZ, treated like a
>   timestamp string
> - timestamp with local time zone - normalize to UTC based on explicit TZ
>   or implicit TZ from env, results normalized to local/client TZ
> - timestamp with time zone - explicitly store the TZ and never normalize
>   or convert, results may contain >1 TZ
>
> It seems to me that the two timestamp type solution is the most popular,
> appears to solve most requirements, and one could explicitly store the TZ
> offset in another column and systems could provide a solution/function to
> convert and return results that contain >1 TZ.

I think storing the TZ offset in a separate column is inconvenient and also
tears one logical type into two parts. It would be like storing a DECIMAL in
two separate columns (one for the unscaled value and one for the scale).

> Without normalization and storing the TZ explicitly, it forces the
> application to apply a conversion function for any comparison operations,
> otherwise it becomes logically impossible (or ambiguous) to apply date
> range filters on such a type. This has an obvious performance impact.
>
> My vote would be for the two type solution, however, it is worth
> explicitly noting that "timestamp with time zone" requires functionality
> beyond the Parquet file format to do the TZ adjustments, whether that be
> done server-side on the result sets knowing client-side TZ settings, or in
> the client-side driver code. Obviously the latter can result in many more
> places for bugs to creep in, as every driver implementation needs to do
> the correct TZ adjustment.
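The range-filter ambiguity Greg mentions can be sketched as follows (Python;
the rows, offsets, and cutoff are made up for illustration and do not come
from any real schema):

```python
from datetime import datetime, timedelta, timezone

# Two rows stored without normalization: local wall clock plus its offset
# kept verbatim, as in the "never normalize" variant.
rows = [
    (datetime(2017, 1, 1, 23, 30), timedelta(hours=-5)),  # 2017-01-02 04:30 UTC
    (datetime(2017, 1, 2, 1, 0), timedelta(hours=3)),     # 2017-01-01 22:00 UTC
]

cutoff_utc = datetime(2017, 1, 2, 0, 0, tzinfo=timezone.utc)

# A naive filter on the stored wall clock picks the wrong rows:
naive = [wall for wall, off in rows
         if wall >= cutoff_utc.replace(tzinfo=None)]

# A correct filter must convert every value to UTC per comparison, which is
# exactly the per-row conversion cost the email warns about:
correct = [wall for wall, off in rows
           if wall.replace(tzinfo=timezone(off)).astimezone(timezone.utc)
              >= cutoff_utc]

assert naive == [datetime(2017, 1, 2, 1, 0)]      # wall-clock filter: row 2
assert correct == [datetime(2017, 1, 1, 23, 30)]  # instant filter: row 1
```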
The implementation of these timestamp types is probably outside the scope of
parquet-mr indeed, but the specification of their interpretations belongs to
parquet-format in my opinion, similar to how the current TIMESTAMP_MILLIS
type is specified.

> > On Feb 27, 2017, at 4:42 PM, Marcel Kornacker <[email protected]> wrote:
> >
> > Greg, thanks for this writeup.
> >
> > Going back to "timestamp with timezone" in Parquet: does anything
> > speak *against* following the SQL standard and storing UTC without an
> > attached timezone (and leaving it to the client to do the conversion
> > correctly for timestamp literals)?
> >
> > On Mon, Feb 27, 2017 at 4:03 PM, Greg Rahn <[email protected]> wrote:
> >> As pointed out, there are several different behaviors and type names
> >> from the SQL world:
> >>
> >> timestamp without time zone (aka timestamp)
> >>
> >> value not normalized - behaves same as storing the timestamp literal as
> >> a string or datetime with fractional seconds
> >> value normalized to UTC based on local environment and adjusted to
> >> local timezone on retrieval
> >>
> >> timestamp with time zone (aka timestamptz)
> >>
> >> value normalized to UTC based on specified time zone offset or value,
> >> or falls back to client env when not specified, adjusted to local
> >> timezone on retrieval
> >> value not normalized and time zone value is actually stored based on
> >> specified time zone offset or value, or falls back to client env when
> >> not specified, values returned are not normalized to local time zone
> >>
> >> I'm not sure I would 100% agree with your table Zoltan, or the comment
> >> on Postgres not following the ANSI standard, as this is a key
> >> description from the ANSI standard:
> >>
> >>> A datetime value, of data type TIME WITHOUT TIME ZONE or TIMESTAMP
> >>> WITHOUT TIME ZONE, may represent a local time, whereas a datetime
> >>> value of data type TIME WITH TIME ZONE or TIMESTAMP WITH TIME ZONE
> >>> represents UTC.
I have trouble interpreting the "may" in this sentence of the standard, as
it only states that it may represent a local time, but not what else it may
represent.

> >> Thus one challenge that exists is that there are two valid behaviors
> >> for TIMESTAMP [WITHOUT TIME ZONE] — conversion from/to local time, or
> >> no adjustment. The other challenge here is that what some RDBMS systems
> >> call the type corresponds to an ANSI type name, however, the behavior
> >> differs. This is exactly the case for ANSI's TIMESTAMP WITH TIME ZONE
> >> and Oracle's TIMESTAMP WITH TIME ZONE, as Oracle's TIMESTAMP WITH LOCAL
> >> TIME ZONE would map to ANSI TIMESTAMP WITH TIME ZONE because neither
> >> stores the TZ info, but both adjust for it. Oracle's TIMESTAMP WITH
> >> TIME ZONE does store the TZ explicitly but does not adjust. Oracle
> >> notes this difference in naming between ANSI & Oracle using a
> >> disclaimer [1]:
> >>
> >>> This chapter describes Oracle datetime and interval datatypes. It does
> >>> not attempt to describe ANSI datatypes or other kinds of datatypes
> >>> except when noted.
> >>
> >> Also I will note some wording clarity -- when the ANSI SQL standard
> >> uses the phrase "time zone displacement" this means the data type
> >> includes a value for the time zone offset (displacement from UTC), be
> >> it explicitly or implicitly. It does not mean that "time zone
> >> displacement" is actually stored on disk, as this contradicts this ANSI
> >> statement:
> >>
> >>> whereas a datetime value of data type TIMESTAMP WITH TIME ZONE
> >>> represents UTC.

In my interpretation this excerpt refers to the two parts of the timestamp
that are stored: the datetime value and the timezone offset. In this
interpretation it simply states that 06:24+02:00 should be stored in two
parts as 04:24 and +02:00, and not as 06:24 and +02:00.
This makes sense, because this way the timestamp 07:24+03:00, which
represents the same instant, is stored in two parts as 04:24 and +03:00
instead of 07:24 and +03:00. This is useful, because the two timestamps
represent the same instant in different time zones, so storing the time
part in UTC allows easy comparison (04:24 == 04:24), while storing them in
local time would not (06:24 != 07:24).

This is further confirmed by the formal definitions:

    <time string> ::= <quote> <time value> [ <time zone interval> ] <quote>

    <timestamp string> ::= <quote> <date value> <space> <time value>
        [ <time zone interval> ] <quote>

These define the <time value> and <date value> parts. The informal text
talks about "time value" and "datetime value" parts for the time and
timestamp types respectively; I think these are meant to refer to
<time value> and <date value> + <time value>, respectively.

> >> So IMO, all the Postgres-based systems (Vertica, Redshift, Greenplum,
> >> etc.) implement the ANSI standard for TIMESTAMP WITH TIME ZONE — they
> >> normalize input values to UTC and return all values converted to local
> >> time zone. Greenplum explicitly cites that their TIMESTAMP
> >> implementation is ANSI SQL:2008 compliant for feature id F051-03 [2]
> >> (which is the ANSI TIMESTAMP type).
> >>
> >> Also see these behavior notes and documentation.
> >>
> >> Vertica -
> >> https://my.vertica.com/docs/8.0.x/HTML/index.htm#Authoring/SQLReferenceManual/DataTypes/Date-Time/TIMESTAMP.htm
> >> PostgreSQL -
> >> https://www.postgresql.org/docs/current/static/datatype-datetime.html#DATATYPE-TIMEZONES
> >> Redshift -
> >> http://docs.aws.amazon.com/redshift/latest/dg/r_Datetime_types.html#r_Datetime_types-timestamptz
> >>
> >> If I look at all of this and compare it to the Parquet Logical Types
> >> [3] doc, to me the thing that is missing is an unambiguous statement of
> >> client behavior for timezone conversion or not.
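The two-part storage reading above can be sketched in Python (the store
helper is my own toy model of the interpretation, not anything from the
standard):

```python
from datetime import time

# 06:24+02:00 and 07:24+03:00 are the same instant. Under the reading
# above, each is stored as a (UTC time part, offset) pair, so the stored
# time parts compare equal directly.
def store(hour, minute, offset_hours):
    """Split a local time into the (UTC part, offset) pair described above."""
    return time((hour - offset_hours) % 24, minute), offset_hours

a = store(6, 24, 2)   # stored as (04:24, +02:00)
b = store(7, 24, 3)   # stored as (04:24, +03:00)

assert a[0] == b[0] == time(4, 24)  # UTC parts match, no conversion needed

# Storing the local wall clock instead would make the same two instants
# compare unequal: 06:24 != 07:24.
```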
I think the Parquet definition mandates adjusting the value to UTC:

    It must annotate an int64 that stores the number of milliseconds from
    the Unix epoch, 00:00:00.000 on 1 January 1970, UTC.

Since the epoch is defined to be a time instant specified in UTC, time zone
normalization has to happen. Of course you could change the definition to
some equivalent epoch in another timezone, for example 01:00:00+01:00, but
you still have to do a timezone adjustment to calculate the elapsed time.
You cannot come up with an equivalent definition that involves the local
timezone, because without an explicit timezone, neither 00:00 nor any other
time on 1 January 1970 refers to a single time instant. Timestamps in the
floating interpretation are ambiguous, as their meaning depends on the
local timezone. Thus, having an epoch specified as an instant in absolute
time mandates a timestamp interpretation that also represents an instant in
absolute time, which the floating interpretation does not.

> >> To my knowledge Apache Drill is the only system that seems to have
> >> implemented the Parquet TIMESTAMP_MILLIS type, and it looks like they
> >> chose to implement it using the normalize-to-UTC behavior, which is not
> >> my preference and not what most RDBMS systems do, including Postgres,
> >> Vertica, Redshift, Greenplum, Netezza, and Oracle, for TIMESTAMP
> >> [WITHOUT TIME ZONE].
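Concretely, elapsed-milliseconds-since-epoch forces a normalization step on
the writer; a Python sketch (the helper name and sample wall clock are
mine):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_millis(dt):
    """Milliseconds since the Unix epoch; dt must be timezone-aware."""
    return int((dt - EPOCH).total_seconds() * 1000)

# The same wall clock in two different zones yields two different
# TIMESTAMP_MILLIS values, so a writer cannot encode a wall clock without
# first deciding which instant it refers to, i.e. normalizing to UTC:
paris = datetime(2017, 2, 28, 12, 0, tzinfo=timezone(timedelta(hours=1)))
utc = datetime(2017, 2, 28, 12, 0, tzinfo=timezone.utc)

assert to_millis(paris) == to_millis(utc) - 3_600_000  # one hour apart
```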
> >> For example, changing the local system timezone setting does not
> >> change the results of the query for a TIMESTAMP [WITHOUT TIME ZONE]
> >> type:
> >>
> >> create table ts1 (t timestamp without time zone);
> >> insert into ts1 values (timestamp '2009-05-12 12:00:00');
> >> insert into ts1 values (timestamp '2009-05-13 12:00:00');
> >> insert into ts1 values (timestamp '2009-05-14 12:00:00');
> >>
> >> show timezone;
> >>    name   |  setting
> >> ----------+------------
> >>  timezone | US/Eastern
> >>
> >> select * from ts1;
> >>           t
> >> ---------------------
> >>  2009-05-12 12:00:00
> >>  2009-05-13 12:00:00
> >>  2009-05-14 12:00:00
> >>
> >> set timezone US/Pacific;
> >> show timezone;
> >>    name   |  setting
> >> ----------+------------
> >>  timezone | US/Pacific
> >>
> >> select * from ts1;
> >>           t
> >> ---------------------
> >>  2009-05-12 12:00:00
> >>  2009-05-13 12:00:00
> >>  2009-05-14 12:00:00
> >>
> >> BTW, Presto has come to realize their implementation is different than
> >> the ANSI SQL standard as well. This GitHub issue [4] has a good writeup
> >> of things.
> >>
> >> References
> >> [1] http://docs.oracle.com/database/122/NLSPG/datetime-data-types-and-time-zone-support.htm#NLSPG237
> >> [2] https://gpdb.docs.pivotal.io/4320/ref_guide/SQL2008_support.html#topic1
> >> [3] https://github.com/apache/parquet-format/blob/master/LogicalTypes.md
> >> [4] https://github.com/prestodb/presto/issues/7122
> >>
> >> On Mon, Feb 27, 2017 at 10:45 AM Marcel Kornacker <[email protected]>
> >> wrote:
> >>>
> >>> On Mon, Feb 27, 2017 at 10:43 AM, Zoltan Ivanfi <[email protected]>
> >>> wrote:
> >>>> What you describe (storing in UTC and adjusting to local time) is the
> >>>> implicit timezone that is associated with the plain TIMESTAMP type of
> >>>> ANSI SQL. Excerpts:
> >>>
> >>> Postgres allows explicit timezone offsets in timestamp literals. When
> >>> these are present they override the implicit local timezone.
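The behavior in Greg's ts1 example, where the session timezone has no
effect, is what the floating interpretation gives; a Python sketch (the
render function is a toy model of a client, not any system's actual code):

```python
from datetime import datetime, timedelta, timezone

# Floating semantics: the stored value is just the wall-clock fields,
# like the naive datetime below (mirroring the ts1 rows above).
stored = datetime(2009, 5, 12, 12, 0)

def render(value, session_tz):
    # A floating value ignores the session time zone entirely; session_tz
    # is deliberately unused.
    return value.isoformat(sep=" ")

eastern = timezone(timedelta(hours=-5))
pacific = timezone(timedelta(hours=-8))

# Same output under both session settings, matching the query results:
assert render(stored, eastern) == render(stored, pacific) == "2009-05-12 12:00:00"
```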
> >>> Datetime data types that contain time fields (TIME and TIMESTAMP) are
> >>> maintained in Universal Coordinated Time (UTC), with an explicit or
> >>> implied time zone part.
> >>>
> >>> A TIME or TIMESTAMP that does not specify WITH TIME ZONE has an im-
> >>> plicit time zone equal to the local time zone for the SQL-session.
> >>> The value of time represented in the data changes along with the
> >>> local time zone for the SQL-session. However, the meaning of the
> >>> time does not change because it is effectively maintained in UTC.
> >>>
> >>> Zoltan
> >>>
> >>> On Mon, Feb 27, 2017 at 7:34 PM Marcel Kornacker <[email protected]>
> >>> wrote:
> >>>>
> >>>> On Mon, Feb 27, 2017 at 8:47 AM, Zoltan Ivanfi <[email protected]>
> >>>> wrote:
> >>>>> Hi,
> >>>>>
> >>>>> Although the draft of SQL-92 [1] does not explicitly state that the
> >>>>> time zone offset has to be stored, the following excerpts strongly
> >>>>> suggest that the time zone has to be stored with each individual
> >>>>> value of TIMESTAMP WITH TIME ZONE:
> >>>>>
> >>>>> The length of a TIMESTAMP is 19 positions [...]
> >>>>> The length of a TIMESTAMP WITH TIME ZONE is 25 positions [...]
> >>>>>
> >>>>> The draft of the SQL:2003 [2] standard is more specific:
> >>>>>
> >>>>> TIMESTAMP and TIME may also be specified as being WITH TIME ZONE, in
> >>>>> which case every value has associated with it a time zone
> >>>>> displacement.
> >>>>
> >>>> It is not clear to me whether that means that the timezone is
> >>>> *stored* with the value. In Postgres, a timestamp still has an
> >>>> associated timezone on input and output (but it only stores the
> >>>> timestamp normalized to UTC).
> >>>>
> >>>>> However, the TIMESTAMP WITH TIME ZONE type in PostgreSQL does not
> >>>>> conform to this definition, but behaves like the plain TIMESTAMP
> >>>>> type of the SQL specification.
> >>>>> Oracle calls this type TIMESTAMP WITH LOCAL TIME ZONE instead.
> >>>>>
> >>>>> It's easier to get an overview of this chaos if we list the
> >>>>> different interpretations and data types:
> >>>>>
> >>>>> Timestamp stored with a time zone:
> >>>>> - ANSI SQL type: TIMESTAMP WITH TIME ZONE
> >>>>> - Oracle type: TIMESTAMP WITH TIME ZONE
> >>>>> - PostgreSQL type: -
> >>>>>
> >>>>> Timestamp stored without a time zone that represents a time in UTC
> >>>>> (automatically converted to/from local time):
> >>>>> - ANSI SQL type: TIMESTAMP
> >>>>> - Oracle type: TIMESTAMP WITH LOCAL TIME ZONE
> >>>>> - PostgreSQL type: TIMESTAMP WITH TIME ZONE
> >>>>> - Parquet type: TIMESTAMP_MILLIS
> >>>>>
> >>>>> Timestamp stored without a time zone that represents floating time
> >>>>> (has the same apparent value regardless of time zone, does not refer
> >>>>> to an absolute time instant):
> >>>>> - ANSI SQL type: -
> >>>>> - Oracle type: TIMESTAMP
> >>>>> - PostgreSQL type: TIMESTAMP
> >>>>> - Impala type: TIMESTAMP (stored as INT96 in Parquet)
> >>>>>
> >>>>> [1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
> >>>>> [2] http://www.wiscorp.com/sql_2003_standard.zip
> >>>>>
> >>>>> Zoltan
> >>>>>
> >>>>> On Thu, Feb 23, 2017 at 10:46 PM Marcel Kornacker <[email protected]>
> >>>>> wrote:
> >>>>>>
> >>>>>> Regarding timestamp with timezone: I'm not sure whether the SQL
> >>>>>> standard requires the timezone to be stored along with the
> >>>>>> timestamp for 'timestamp with timezone' (at least Oracle and
> >>>>>> Postgres diverge on that topic).
> >>>>>>
> >>>>>> Cc'ing Greg Rahn to shed some more light on that.
> >>>>>>
> >>>>>> Regarding 'make Impala depend on parquet-cpp': could someone expand
> >>>>>> on why we want to do this?
> >>>>>> There probably is overlap between Impala/Kudu/parquet-cpp, but the
> >>>>>> runtime systems of the first two have specific requirements (and
> >>>>>> are also different from each other), so trying to unify this into
> >>>>>> parquet-cpp seems difficult.
> >>>>>>
> >>>>>> On Thu, Feb 23, 2017 at 11:22 AM, Julien Le Dem <[email protected]>
> >>>>>> wrote:
> >>>>>>> Attendees/agenda:
> >>>>>>> - Nandor, Zoltan (Cloudera/file formats)
> >>>>>>> - Lars (Cloudera/Impala): Statistics progress
> >>>>>>> - Uwe (Blue Yonder): Parquet cpp RC. Int96 timestamps
> >>>>>>> - Wes (twosigma): parquet cpp rc. 1.0 Release
> >>>>>>> - Julien (Dremio): parquet metadata. Statistics.
> >>>>>>> - Deepak (HP/Vertica): Parquet-cpp
> >>>>>>> - Kazuaki:
> >>>>>>> - Ryan was excused :)
> >>>>>>>
> >>>>>>> Notes:
> >>>>>>> - Statistics: https://github.com/apache/parquet-format/pull/46
> >>>>>>>   - Impala is waiting for parquet-format to settle on the format
> >>>>>>>     to finalize their implementation.
> >>>>>>>   - Action: Julien to follow up with Ryan on the PR
> >>>>>>>
> >>>>>>> - Int96 timestamps:
> >>>>>>>   https://github.com/apache/parquet-format/pull/49
> >>>>>>>   (needs Ryan's feedback)
> >>>>>>>   - format is a nanosecond-level timestamp from midnight (64 bits)
> >>>>>>>     followed by the number of days (32 bits)
> >>>>>>>   - it sounds like int96 ordering is different from natural byte
> >>>>>>>     array ordering because days is last in the bytes
> >>>>>>>   - discussion about swapping bytes:
> >>>>>>>     - format dependent on the boost library used
> >>>>>>>     - there could be performance concerns in Impala against
> >>>>>>>       changing it
> >>>>>>>     - there may be a separate project in impala to swap the bytes
> >>>>>>>       for kudu compatibility
> >>>>>>>   - discussion about deprecating int96:
> >>>>>>>     - need to be able to read them always
> >>>>>>>     - no need to define ordering if we have a clear replacement
> >>>>>>>     - need to clarify the requirements for an alternative
> >>>>>>>   - int64 could be enough; it sounds like nanosecond granularity
> >>>>>>>     might not be needed
> >>>>>>>   - Julien to create JIRAs:
> >>>>>>>     - int96 ordering
> >>>>>>>     - int96 deprecation, replacement
> >>>>>>>
> >>>>>>> - extra timestamp logical types:
> >>>>>>>   - floating timestamp (no TZ stored; up to the reader to
> >>>>>>>     interpret the TS based on their TZ)
> >>>>>>>     - this would be better for following the SQL standard
> >>>>>>>     - Julien to create JIRA
> >>>>>>>   - timestamp with timezone (per SQL):
> >>>>>>>     - each value has a timezone
> >>>>>>>     - TZ can be different for each value
> >>>>>>>     - Julien to create JIRA
> >>>>>>>
> >>>>>>> - parquet-cpp 1.0 release
> >>>>>>>   - Uwe to update release script in master
> >>>>>>>   - Uwe to launch a new vote with new RC
> >>>>>>>
> >>>>>>> - make impala depend on parquet-cpp
> >>>>>>>   - duplication between parquet/impala/kudu
> >>>>>>>   - need to measure level of overlap
> >>>>>>>   - Wes to open JIRA for this
> >>>>>>>   - also need an "apache commons for c++" for SQL type operations:
> >>>>>>>     -> could be in arrow
> >>>>>>>
> >>>>>>> - metadata improvements
> >>>>>>>   - add page level metadata in footer
> >>>>>>>   - page skipping
> >>>>>>>   - Julien to open JIRA
> >>>>>>>
> >>>>>>> - add version of the writer in the footer (more precise than
> >>>>>>>   current)
> >>>>>>>   - Zoltan to open JIRA
> >>>>>>>   - possibly add bitfield for bug fixes
> >>>>>>>
> >>>>>>> On Thu, Feb 23, 2017 at 10:01 AM, Julien Le Dem <[email protected]>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Julien
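For reference, the INT96 layout described in the meeting notes above
(nanoseconds since midnight followed by a day count, stored as a Julian day
number) can be decoded roughly as in this Python sketch. The function name
and sample value are mine, and whether the result should be read as UTC or
as floating time is exactly the open question in this thread, so the sketch
returns a naive datetime:

```python
import struct
from datetime import datetime, timedelta

JULIAN_UNIX_EPOCH = 2_440_588  # Julian day number of 1970-01-01

def decode_int96(raw: bytes) -> datetime:
    """Decode 12 bytes laid out as 8 bytes of nanoseconds since midnight
    followed by 4 bytes of Julian day, both little-endian."""
    nanos, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_UNIX_EPOCH
    # Truncate to microseconds, the finest granularity datetime supports.
    return datetime(1970, 1, 1) + timedelta(days=days,
                                            microseconds=nanos // 1000)

# Noon on 1970-01-02, encoded by hand for the round-trip check:
raw = struct.pack("<qi", 12 * 3600 * 10**9, JULIAN_UNIX_EPOCH + 1)
assert decode_int96(raw) == datetime(1970, 1, 2, 12, 0)
```

Note that because the day count sits in the last four bytes, lexicographic
byte-array ordering does not match chronological ordering, which is the
int96 ordering problem raised in the notes.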
