I think this is a lengthier topic than anticipated, so I vote for reopening it at the next sync meeting.
On Tue, Feb 28, 2017 at 8:21 AM, Zoltan Ivanfi <[email protected]> wrote: > I just realized that my example with the DECIMAL type was not a particularly > good one as unlike the timezone of a timestamp, the denominator of a DECIMAL > is the same for all values of a single column. > > Zoltan > > On Tue, Feb 28, 2017 at 4:22 PM Zoltan Ivanfi <[email protected]> wrote: >> >> Hi, >> >> Please see my comments inline. >> >> On Tue, Feb 28, 2017 at 3:00 AM, Greg Rahn <[email protected]> wrote: >>> >>> I think the decision comes down to how many TIMESTAMP types does Parquet >>> (and systems that use it a format) want to support or the use cases that are >>> being targeted. >> >> >> Please note that there already is a timestamp type specified by Parquet: >> TIMESTAMP_MILLIS. It does not contain an explicit timezone, but the >> specification ties it to a UTC instant, so it works differently than >> Impala's "floating" timestamp type stored in an INT96. >> >>> >>> If the answer is two, then it makes sense to follow the ANSI standard and >>> what Postgres et al. have done: >>> - timestamp [ without time zone ] - never adjust for TZ, treated like a >>> timestamp string >>> - timestamp with time zone - normalize to UTC based on explicit TZ input >>> or implicit TZ from env, results normalized to local/client TZ >> >> >> The ANSI SQL-92 timestamp [without time zone] is also stored in UTC and >> adjusted to local time, as the following excerpt shows: >> >> A TIME or TIMESTAMP that does not specify WITH TIME ZONE has an im- >> plicit time zone equal to the local time zone for the SQL-session. >> The value of time represented in the data changes along with the >> local time zone for the SQL-session. However, the meaning of the >> time does not change because it is effectively maintained in UTC. >> >>> >>> If the answer is three, then it makes sense to mimic Oracle’s timestamp >>> types despite the differences from ANSI SQL naming/behavior: >>> - timestamp [ without time zone ] - never adjust for TZ, treated like a >>> timestamp string >>> - timestamp with local time zone - normalize to UTC based on explicit TZ >>> or implicit TZ from env, results normalized to local/client TZ >>> - timestamp with time zone - explicitly store the TZ and never normalize >>> or convert, results may contain >1 TZ >>> >>> It seems to me that the two timestamp type solution is the most popular, >>> appears to solve most requirements, and one could explicitly store the TZ >>> offset in another column and systems could provide a solution/function to >>> convert and return results that contain >1 TZ. >> >> >> I think storing the TZ offset in a separate column is inconvenient and >> also tears one logical type into two parts. It would be like storing a >> DECIMAL in two separate columns (one for the numerator and one for the >> exponent of the denominator). >> >>> >>> Without normalization and storing the TZ explcitly, it forces the >>> application to apply a conversion function for any comparisons operations >>> otherwise it becomes logically impossible (or ambiguous) to apply date range >>> filters on such a type. This has an obvious performance impact. >>> >>> My vote would be for the two type solution, however, it is worth >>> explicitly noting, that "timestamp with time zone" requires functionality >>> beyond the Parquet file format to do the TZ adjustments whether that be done >>> server-side on the result sets knowing client-side TZ settings, or in the >>> client-side driver code. Obviously the latter can result in many more >>> places for bugs to creep in as every driver implementation needs to do the >>> correct TZ adjustment. >> >> >> The implementation of these timestamp types is probably outside of the >> scope of parquet-mr indeed, but the specification of the their >> interpretations belong to parquet-format in my opinion, similar to how the >> current TIMESTMAP_MILLIS type is specified. >> >>> >>> > On Feb 27, 2017, at 4:42 PM, Marcel Kornacker <[email protected]> >>> > wrote: >>> > >>> > Greg, thanks for this writeup. >>> > >>> > Going back to "timestamp with timezone" in Parquet: does anything >>> > speak *against* following the SQL standard and storing UTC without an >>> > attached timezone (and leaving it to the client to do the conversion >>> > correctly for timestamp literals)? >>> > >>> > On Mon, Feb 27, 2017 at 4:03 PM, Greg Rahn <[email protected]> wrote: >>> >> As pointed out, there are several different behaviors and type names >>> >> from >>> >> the SQL world: >>> >> >>> >> timestamp without time zone (aka timestamp) >>> >> >>> >> value not normalized - behaves same as storing the timestamp literal >>> >> as a >>> >> string or datetime with fractional seconds >>> >> value normalized to UTC based on local environment and adjusted to >>> >> local >>> >> timezone on retrieval >>> >> >>> >> timestamp with time zone (aka timestamptz) >>> >> >>> >> value normalized to UTC based on specified time zone offset or value >>> >> or >>> >> falls back to client env when not specified, adjusted to local >>> >> timezone on >>> >> retrieval >>> >> value not normalized and time zone value is actually stored based on >>> >> specified time zone offset or value or falls back to client env when >>> >> not >>> >> specified, values returned are not normalized to local time zone >>> >>> >>>> I'm not sure I would 100% agree with your table Zoltan, or the >>> >>>> comment on >>> >> Postgres not following the ANSI standard, as this is a key description >>> >> from >>> >> the ANSI standard: >>> >> >>> >>> A datetime value, of data type TIME WITHOUT TIME ZONE or TIMESTAMP >>> >>> WITHOUT >>> >>> TIME ZONE, may represent a local time, whereas a datetime value of >>> >>> data type >>> >>> TIME WITH TIME ZONE or TIMESTAMP WITH TIME ZONE represents UTC. >> >> >> I have trouble interpreting the "may" in this sentence of the standard as >> it only states that it may represent a local time but not what else it may >> represent. >> >>> >>> >> Thus one challenge that exists is that there are two valid behaviors >>> >> for >>> >> TIMESTAMP [WITHOUT TIME ZONE] — conversion from/to local time, or no >>> >> adjustment. The other challenge here is that what some RDBMS systems >>> >> call >>> >> the type corresponds to an ANSI type name, however, the behavior >>> >> differs. >>> >> This is exactly the case for ANSI's TIMESTAMP WITH TIME ZONE and >>> >> Oracle's >>> >> TIMESTAMP WITH TIME ZONE as Oracle's TIMESTAMP WITH LOCAL TIME ZONE >>> >> would >>> >> map to ANSI TIMESTAMP WITH TIME ZONE because neither store the TZ >>> >> info, but >>> >> both adjust for it. Oracle's TIMESTAMP WITH TIME ZONE does store the >>> >> TZ >>> >> explicitly but does not adjust. Oracle notes this difference in >>> >> naming >>> >> between ANSI & Oracle using a disclaimer [1]: >>> >> >>> >>> This chapter describes Oracle datetime and interval datatypes. It >>> >>> does not >>> >>> attempt to describe ANSI datatypes or other kinds of datatypes except >>> >>> when >>> >>> noted. >>> >> >>> >> >>> >> Also I will note some wording clarity -- when the ANSI SQL standard >>> >> uses the >>> >> phrase "time zone displacement" this means the data type includes a >>> >> value >>> >> for the time zone offset (displacement from UTC), be it explicitly or >>> >> implicitly. It does not mean that "time zone displacement" is >>> >> actually >>> >> stored on disk as this contradicts this ANSI statement: >>> >> >>> >>> whereas a datetime value of data type TIMESTAMP WITH TIME ZONE >>> >>> represents >>> >>> UTC. >> >> >> In my interpretation this excerpt refers to the two parts of the timestamp >> that are stored: the datetime value and the timezone offset. In this >> interpretation it simply states that 06:24+02:00 should be stored in two >> parts as 04:24 and +02:00 and not as 06:24 and +02:00. This makes sense, >> because this way the timestamp 07:24+03:00, which represents the same >> instant is stored in two parts as 04:24 and +03:00 instead of 07:24 and >> +03:00 This is useful, because the two timestamps represent the same instant >> in different time zones, so storing the time part in UTC allows easy >> comparison (04:24 == 04:24), while storing them in local time would not >> (06:24 != 07:24). This is further confirmed by the formal definitions: >> >> <time string> ::= >> <quote> <time value> [ <time zone interval> ] <quote> >> >> <timestamp string> ::= >> <quote> <date value> <space> <time value> [ <time zone >> interval> ] <quote> >> >> These define the <time value> and <date value> parts. The informal text >> talks about "time value" and "datetime value" parts for the time and >> timestamp types respectively, I think these are meant to refer to <time >> value> and <date value> + <time value>, respectively. >> >>> >>> >> So IMO, all the Postgres-based systems (Vertica, Redshift, Greenplum, >>> >> etc) >>> >> implement the ANSI standard for TIMESTAMP WITH TIME ZONE — they >>> >> normalize >>> >> input values to UTC and return all values converted to local time >>> >> zone. >>> >> Greenplum explicitly cites their TIMESTAMP implementation is ANSI >>> >> SQL:2008 >>> >> compliant for feature id F051-03 [2] (which is the ANSI TIMESTAMP >>> >> type). >>> >> >>> >> Also see these behavior notes and documentation. >>> >> >>> >> Vertica - >>> >> >>> >> https://my.vertica.com/docs/8.0.x/HTML/index.htm#Authoring/SQLReferenceManual/DataTypes/Date-Time/TIMESTAMP.htm >>> >> PostgreSQL - >>> >> >>> >> https://www.postgresql.org/docs/current/static/datatype-datetime.html#DATATYPE-TIMEZONES >>> >> Redshift - >>> >> >>> >> http://docs.aws.amazon.com/redshift/latest/dg/r_Datetime_types.html#r_Datetime_types-timestamptz >>> >> >>> >> >>> >> If I look at all of this and compare it to the Parquet Logical Types >>> >> [3] >>> >> doc, to me the thing that is missing is an unambiguous statement of >>> >> client >>> >> behavior for timezone conversion or not. >> >> >> I think the Parquet definition mandates adjusting the value to UTC: >> >> It must annotate an int64 that stores the number of milliseconds from >> the Unix epoch, 00:00:00.000 on 1 January 1970, UTC. >> >> Since the epoch is defined to be a time instant specified in UTC, time >> zone normalization has to happen. Of course you could change the definition >> to some equivalent epoch in another timezone, for example 01:00:00+01:00, >> but you still have to do a timezone adjustment to calculate the elapsed >> time. You can not come up with an equivalent definition that involves the >> local timezone, because without an explicit timezone, neither 00:00 nor any >> other time on 1 January 1970 refers to a single time instant. Timestamps in >> the floating interpretation are ambiguous, their meaning depends on the >> local timezone, thus having an epoch specified as an instant in absolute >> time mandates a timestamp interpretation that also represents an instant in >> absolute time and floating time does not. >> >>> >>> >> To my knowledge Apache Drill is >>> >> the only system that seems to have implemented the Parquet >>> >> TIMESTAMP_MILLIS >>> >> type and it looks like they chose to implement it using the normalize >>> >> to UTC >>> >> behavior which is not my preference and not what most RDBMS systems >>> >> do, >>> >> including Postgres, Vertica, Redshift, Greenplum, Netezza, and Oracle >>> >> do for >>> >> TIMESTAMP [WITHOUT TIME ZONE]. >>> >> >>> >> For example, changing the local system timezone setting does not >>> >> change the >>> >> results of the query for a TIMESTAMP [WITHOUT TIME ZONE] type: >>> >> >>> >> create table ts1 (t timestamp without time zone); >>> >> insert into ts1 values (timestamp '2009-05-12 12:00:00'); >>> >> insert into ts1 values (timestamp '2009-05-13 12:00:00'); >>> >> insert into ts1 values (timestamp '2009-05-14 12:00:00'); >>> >> >>> >> show timezone; >>> >> name | setting >>> >> ----------+------------ >>> >> timezone | US/Eastern >>> >> >>> >> select * from ts1; >>> >> t >>> >> --------------------- >>> >> 2009-05-12 12:00:00 >>> >> 2009-05-13 12:00:00 >>> >> 2009-05-14 12:00:00 >>> >> >>> >> set timezone US/Pacific; >>> >> show timezone; >>> >> >>> >> name | setting >>> >> ----------+------------ >>> >> timezone | US/Pacific >>> >> >>> >> select * from ts1; >>> >> t >>> >> --------------------- >>> >> 2009-05-12 12:00:00 >>> >> 2009-05-13 12:00:00 >>> >> 2009-05-14 12:00:00 >>> >> >>> >> >>> >> BTW, Presto has come to realize their implementation is different than >>> >> the >>> >> ANSI SQL standard as well. This GitHub issue [4] has a good writeup >>> >> of >>> >> things. >>> >> >>> >> References >>> >> [1] >>> >> >>> >> http://docs.oracle.com/database/122/NLSPG/datetime-data-types-and-time-zone-support.htm#NLSPG237 >>> >> [2] >>> >> https://gpdb.docs.pivotal.io/4320/ref_guide/SQL2008_support.html#topic1 >>> >> [3] >>> >> https://github.com/apache/parquet-format/blob/master/LogicalTypes.md >>> >> [4] https://github.com/prestodb/presto/issues/7122 >>> >> >>> >> >>> >> >>> >> >>> >> On Mon, Feb 27, 2017 at 10:45 AM Marcel Kornacker <[email protected]> >>> >> wrote: >>> >>> >>> >>> On Mon, Feb 27, 2017 at 10:43 AM, Zoltan Ivanfi <[email protected]> >>> >>> wrote: >>> >>>> What you describe (storing in UTC and adjusting to local time) is >>> >>>> the >>> >>>> implicit timezone that is associated with the plain TIMEZONE type of >>> >>>> ANSI >>> >>>> SQL. Excerpts: >>> >>> >>> >>> Postgres allows explicit timezone offsets in timestamp literals. When >>> >>> these are present they override the implicit local timezone. >>> >> >>> >> >>> >> >>> >> >>> >>> >>> >>> Datetime data types that contain time fields (TIME and TIMESTAMP) >>> >>> are >>> >>> maintained >>> >>> in Universal Coordinated Time (UTC), with an explicit or implied >>> >>> time >>> >>> zone >>> >>> part. >>> >>> >>> >>> A TIME or TIMESTAMP that does not specify WITH TIME ZONE has an im- >>> >>> plicit time zone equal to the local time zone for the SQL-session. >>> >>> The value of time represented in the data changes along with the >>> >>> local time zone for the SQL-session. However, the meaning of the >>> >>> time does not change because it is effectively maintained in UTC. >>> >>> >>> >>> Zoltan >>> >>> >>> >>> On Mon, Feb 27, 2017 at 7:34 PM Marcel Kornacker <[email protected]> >>> >>> wrote: >>> >>>> >>> >>>> On Mon, Feb 27, 2017 at 8:47 AM, Zoltan Ivanfi <[email protected]> >>> >>>> wrote: >>> >>>>> Hi, >>> >>>>> >>> >>>>> Although the draft of SQL-92[1] does not explicitly state that the >>> >>>>> time >>> >>>>> zone >>> >>>>> offset has to be stored, the following excerpts strongly suggest >>> >>>>> that >>> >>>>> the >>> >>>>> time zone has to be stored with each individual value of TIMESTAMP >>> >>>>> WITH >>> >>>>> TIME >>> >>>>> ZONE: >>> >>>>> >>> >>>>> The length of a TIMESTAMP is 19 positions [...] >>> >>>>> The length of a TIMESTAMP WITH TIME ZONE is 25 positions [...] >>> >>>>> >>> >>>>> The draft of SQL 2003[2] standard is more specific: >>> >>>>> >>> >>>>> TIMESTAMP and TIME may also be specified as being WITH TIME ZONE, >>> >>>>> in >>> >>>>> which >>> >>>>> case every value has associated with it a time zone displacement. >>> >>>> >>> >>>> It is not clear to me whether that means that the timezone is >>> >>>> *stored* >>> >>>> with the value. In Postgres, a timestamp still has an associated >>> >>>> timezone on input and output (but it only stores the timezone >>> >>>> normalized to UTC). >>> >>>> >>> >>>>> >>> >>>>> However, the TIMESTAMP WITH TIME ZONE type in PostgreSQL does not >>> >>>>> conform to >>> >>>>> this definition, but behaves like the plain TIMESTAMP type of the >>> >>>>> SQL >>> >>>>> specification. Oracle calls this type TIMESTAMP WITH LOCAL TIME >>> >>>>> instead. >>> >>>>> >>> >>>>> It's easier to get an overview on this chaos if we list the >>> >>>>> different >>> >>>>> interpretations and data types: >>> >>>>> >>> >>>>> Timestamp stored with a time zone: >>> >>>>> - ANSI SQL type: TIMESTAMP WITH TIME ZONE >>> >>>>> - Oracle type: TIMESTAMP WITH TIME ZONE >>> >>>>> - PostgeSQL type: - >>> >>>>> >>> >>>>> Timestamp stored without a time zone that represents a time in UTC >>> >>>>> (automatically converted to/from local time): >>> >>>>> - ANSI SQL type: TIMESTAMP >>> >>>>> - Oracle type: TIMESTAMP WITH LOCAL TIME >>> >>>>> - PostgeSQL type: TIMESTAMP WITH TIME ZONE >>> >>>>> - Parquet type: TIMESTAMP_MILLIS >>> >>>>> >>> >>>>> Timestamp stored without a time zone that represents floating time >>> >>>>> (has >>> >>>>> the >>> >>>>> same apparent value regardless of time zone, does not refer to an >>> >>>>> absolute >>> >>>>> time instant): >>> >>>>> - ANSI SQL type: - >>> >>>>> - Oracle type: TIMESTAMP >>> >>>>> - PostgeSQL type: TIMESTAMP >>> >>>>> - Impala type: TIMESTAMP (stored as INT96 in Parquet) >>> >>>>> >>> >>>>> [1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt >>> >>>>> [2] http://www.wiscorp.com/sql_2003_standard.zip >>> >>>>> >>> >>>>> Zoltan >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> On Thu, Feb 23, 2017 at 10:46 PM Marcel Kornacker >>> >>>>> <[email protected]> >>> >>>>> wrote: >>> >>>>>> >>> >>>>>> Regarding timestamp with timezone: I'm not sure whether the SQL >>> >>>>>> standard requires the timezone to be stored along with the >>> >>>>>> timestamp >>> >>>>>> for 'timestamp with timezone' (at least Oracle and Postgres >>> >>>>>> diverge on >>> >>>>>> that topic). >>> >>>>>> >>> >>>>>> Cc'ing Greg Rahn to shed some more light on that. >>> >>>>>> >>> >>>>>> Regarding 'make Impala depend on parquet-cpp': could someone >>> >>>>>> expand on >>> >>>>>> why we want to do this? There probably is overlap between >>> >>>>>> Impala/Kudu/parquet-cpp, but the runtime systems of the first two >>> >>>>>> have >>> >>>>>> specific requirements (and are also different from each other), so >>> >>>>>> trying to unify this into parquet-cpp seems difficult. >>> >>>>>> >>> >>>>>> On Thu, Feb 23, 2017 at 11:22 AM, Julien Le Dem >>> >>>>>> <[email protected]> >>> >>>>>> wrote: >>> >>>>>>> Attendees/agenda: >>> >>>>>>> - Nandor, Zoltan (Cloudera/file formats) >>> >>>>>>> - Lars (Cloudera/Impala)" Statistics progress >>> >>>>>>> - Uwe (Blue Yonder): Parquet cpp RC. Int96 timestamps >>> >>>>>>> - Wes (twosigma): parquet cpp rc. 1.0 Release >>> >>>>>>> - Julien (Dremio): parquet metadata. Statistics. >>> >>>>>>> - Deepak (HP/Vertica): Parquet-cpp >>> >>>>>>> - Kazuaki: >>> >>>>>>> - Ryan was excused :) >>> >>>>>>> >>> >>>>>>> Note: >>> >>>>>>> - Statistics: https://github.com/apache/parquet-format/pull/46 >>> >>>>>>> - Impala is waiting for parquet-format to settle on the format >>> >>>>>>> to >>> >>>>>>> finalize their simple mentation. >>> >>>>>>> - Action: Julien to follow up with Ryan on the PR >>> >>>>>>> >>> >>>>>>> - Int96 timestamps: >>> >>>>>>> https://github.com/apache/parquet-format/pull/49 >>> >>>>>>> (needs Ryan's feedback) >>> >>>>>>> - format is nanosecond level timestamp from midnight (64 bits) >>> >>>>>>> followed >>> >>>>>>> by number of days (32 bits) >>> >>>>>>> - it sounds like int96 ordering is different from natural byte >>> >>>>>>> array >>> >>>>>>> ordering because days is last in the bytes >>> >>>>>>> - discussion about swapping bytes: >>> >>>>>>> - format dependent on the boost library used >>> >>>>>>> - there could be performance concerns in Impala against >>> >>>>>>> changing >>> >>>>>>> it >>> >>>>>>> - there may be a separate project in impala to swap the >>> >>>>>>> bytes >>> >>>>>>> for >>> >>>>>>> kudu compatibility. >>> >>>>>>> - discussion about deprecating int96: >>> >>>>>>> - need to be able to read them always >>> >>>>>>> - not need to define ordering if we have a clear replacement >>> >>>>>>> - Need to clarify the requirement for alternative . >>> >>>>>>> - int64 could be enough it sounds that nanosecond granularity >>> >>>>>>> might >>> >>>>>>> not be needed. >>> >>>>>>> - Julien to create JIRAs: >>> >>>>>>> - int96 ordering >>> >>>>>>> - int96 deprecation, replacement. >>> >>>>>>> >>> >>>>>>> - extra timestamp logical type: >>> >>>>>>> - floating timestamp: (not TZ stored. up to the reader to >>> >>>>>>> interpret >>> >>>>>>> TS >>> >>>>>>> based on their TZ) >>> >>>>>>> - this would be better for following sql standard >>> >>>>>>> - Julien to create JIRA >>> >>>>>>> - timestamp with timezone (per SQL): >>> >>>>>>> - each value has timezone >>> >>>>>>> - TZ can be different for each value >>> >>>>>>> - Julien to create JIRA >>> >>>>>>> >>> >>>>>>> - parquet-cpp 1.0 release >>> >>>>>>> - Uwe to update release script in master. >>> >>>>>>> - Uwe to launch a new vote with new RC >>> >>>>>>> >>> >>>>>>> - make impala depend on parquet-cpp >>> >>>>>>> - duplication between parquet/impala/kudu >>> >>>>>>> - need to measure level of overlap >>> >>>>>>> - Wes to open JIRA for this >>> >>>>>>> - also need an "apache commons for c++” for SQL type operations: >>> >>>>>>> -> could be in arrow >>> >>>>>>> >>> >>>>>>> - metadata improvements. >>> >>>>>>> - add page level metadata in footer >>> >>>>>>> - page skipping. >>> >>>>>>> - Julien to open JIRA. >>> >>>>>>> >>> >>>>>>> - add version of the writer in the footer (more precise than >>> >>>>>>> current). >>> >>>>>>> - Zoltan to open Jira >>> >>>>>>> - possibly add bitfield for bug fixes. >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> On Thu, Feb 23, 2017 at 10:01 AM, Julien Le Dem >>> >>>>>>> <[email protected]> >>> >>>>>>> wrote: >>> >>>>>>> >>> >>>>>>>> >>> >>>>>>>> https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up >>> >>>>>>>> >>> >>>>>>>> -- >>> >>>>>>>> Julien >>> >>>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> -- >>> >>>>>>> Julien >>> >> >
