On Mon, Feb 27, 2017 at 10:43 AM, Zoltan Ivanfi <[email protected]> wrote:
> What you describe (storing in UTC and adjusting to local time) is the
> implicit timezone that is associated with the plain TIMEZONE type of ANSI
> SQL. Excerpts:

Postgres allows explicit timezone offsets in timestamp literals. When
these are present they override the implicit local timezone.

>
>   Datetime data types that contain time fields (TIME and TIMESTAMP) are
> maintained
>   in Universal Coordinated Time (UTC), with an explicit or implied time zone
> part.
>
>   A TIME or TIMESTAMP that does not specify WITH TIME ZONE has an im-
>   plicit time zone equal to the local time zone for the SQL-session.
>   The value of time represented in the data changes along with the
>   local time zone for the SQL-session. However, the meaning of the
>   time does not change because it is effectively maintained in UTC.
>
> Zoltan
>
> On Mon, Feb 27, 2017 at 7:34 PM Marcel Kornacker <[email protected]> wrote:
>>
>> On Mon, Feb 27, 2017 at 8:47 AM, Zoltan Ivanfi <[email protected]> wrote:
>> > Hi,
>> >
>> > Although the draft of SQL-92[1] does not explicitly state that the time
>> > zone
>> > offset has to be stored, the following excerpts strongly suggest that
>> > the
>> > time zone has to be stored with each individual value of TIMESTAMP WITH
>> > TIME
>> > ZONE:
>> >
>> >   The length of a TIMESTAMP is 19 positions [...]
>> >   The length of a TIMESTAMP WITH TIME ZONE is 25 positions [...]
>> >
>> > The draft of SQL 2003[2] standard is more specific:
>> >
>> >   TIMESTAMP and TIME may also be specified as being WITH TIME ZONE, in
>> > which
>> > case every value has associated with it a time zone displacement.
>>
>> It is not clear to me whether that means that the timezone is *stored*
>> with the value. In Postgres, a timestamp still has an associated
>> timezone on input and output (but it only stores the timezone
>> normalized to UTC).
>>
>> >
>> > However, the TIMESTAMP WITH TIME ZONE type in PostgreSQL does not
>> > conform to
>> > this definition, but behaves like the plain TIMESTAMP type of the SQL
>> > specification. Oracle calls this type TIMESTAMP WITH LOCAL TIME instead.
>> >
>> > It's easier to get an overview on this chaos if we list the different
>> > interpretations and data types:
>> >
>> > Timestamp stored with a time zone:
>> > - ANSI SQL type: TIMESTAMP WITH TIME ZONE
>> > - Oracle type: TIMESTAMP WITH TIME ZONE
>> > - PostgeSQL type: -
>> >
>> > Timestamp stored without a time zone that represents a time in UTC
>> > (automatically converted to/from local time):
>> > - ANSI SQL type: TIMESTAMP
>> > - Oracle type: TIMESTAMP WITH LOCAL TIME
>> > - PostgeSQL type: TIMESTAMP WITH TIME ZONE
>> > - Parquet type: TIMESTAMP_MILLIS
>> >
>> > Timestamp stored without a time zone that represents floating time (has
>> > the
>> > same apparent value regardless of time zone, does not refer to an
>> > absolute
>> > time instant):
>> > - ANSI SQL type: -
>> > - Oracle type: TIMESTAMP
>> > - PostgeSQL type: TIMESTAMP
>> > - Impala type: TIMESTAMP (stored as INT96 in Parquet)
>> >
>> > [1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
>> > [2] http://www.wiscorp.com/sql_2003_standard.zip
>> >
>> > Zoltan
>> >
>> >
>> >
>> > On Thu, Feb 23, 2017 at 10:46 PM Marcel Kornacker <[email protected]>
>> > wrote:
>> >>
>> >> Regarding timestamp with timezone: I'm not sure whether the SQL
>> >> standard requires the timezone to be stored along with the timestamp
>> >> for 'timestamp with timezone' (at least Oracle and Postgres diverge on
>> >> that topic).
>> >>
>> >> Cc'ing Greg Rahn to shed some more light on that.
>> >>
>> >> Regarding 'make Impala depend on parquet-cpp': could someone expand on
>> >> why we want to do this? There probably is overlap between
>> >> Impala/Kudu/parquet-cpp, but the runtime systems of the first two have
>> >> specific requirements (and are also different from each other), so
>> >> trying to unify this into parquet-cpp seems difficult.
>> >>
>> >> On Thu, Feb 23, 2017 at 11:22 AM, Julien Le Dem <[email protected]>
>> >> wrote:
>> >> >  Attendees/agenda:
>> >> > - Nandor, Zoltan (Cloudera/file formats)
>> >> > - Lars (Cloudera/Impala)" Statistics progress
>> >> > - Uwe (Blue Yonder): Parquet cpp RC. Int96 timestamps
>> >> > - Wes (twosigma): parquet cpp rc. 1.0 Release
>> >> > - Julien (Dremio): parquet metadata. Statistics.
>> >> > - Deepak (HP/Vertica): Parquet-cpp
>> >> > - Kazuaki:
>> >> > - Ryan was excused :)
>> >> >
>> >> > Note:
>> >> >  - Statistics: https://github.com/apache/parquet-format/pull/46
>> >> >    - Impala is waiting for parquet-format to settle on the format to
>> >> > finalize their simple mentation.
>> >> >    - Action: Julien to follow up with Ryan on the PR
>> >> >
>> >> >  - Int96 timestamps: https://github.com/apache/parquet-format/pull/49
>> >> > (needs Ryan's feedback)
>> >> >    - format is nanosecond level timestamp from midnight (64 bits)
>> >> > followed
>> >> > by number of days (32 bits)
>> >> >    - it sounds like int96 ordering is different from natural byte
>> >> > array
>> >> > ordering because days is last in the bytes
>> >> >    - discussion about swapping bytes:
>> >> >       - format dependent on the boost library used
>> >> >       - there could be performance concerns in Impala against
>> >> > changing
>> >> > it
>> >> >       - there may be a separate project in impala to swap the bytes
>> >> > for
>> >> > kudu compatibility.
>> >> >    - discussion about deprecating int96:
>> >> >      - need to be able to read them always
>> >> >      - not need to define ordering if we have a clear replacement
>> >> >      - Need to clarify the requirement for alternative .
>> >> >      - int64 could be enough it sounds that nanosecond granularity
>> >> > might
>> >> > not be needed.
>> >> >    - Julien to create JIRAs:
>> >> >      - int96 ordering
>> >> >      - int96 deprecation, replacement.
>> >> >
>> >> > - extra timestamp logical type:
>> >> >  - floating timestamp: (not TZ stored. up to the reader to interpret
>> >> > TS
>> >> > based on their TZ)
>> >> >     - this would be better for following sql standard
>> >> >     - Julien to create JIRA
>> >> >  - timestamp with timezone (per SQL):
>> >> >     - each value has timezone
>> >> >     - TZ can be different for each value
>> >> >     - Julien to create JIRA
>> >> >
>> >> >  - parquet-cpp 1.0 release
>> >> >    - Uwe to update release script in master.
>> >> >    - Uwe to launch a new vote with new RC
>> >> >
>> >> >  - make impala depend on parquet-cpp
>> >> >   - duplication between parquet/impala/kudu
>> >> >   - need to measure level of overlap
>> >> >   - Wes to open JIRA for this
>> >> >   - also need an "apache commons for c++” for SQL type operations:
>> >> >      -> could be in arrow
>> >> >
>> >> >   - metadata improvements.
>> >> >    - add page level metadata in footer
>> >> >    - page skipping.
>> >> >    - Julien to open JIRA.
>> >> >
>> >> >  - add version of the writer in the footer (more precise than
>> >> > current).
>> >> >    - Zoltan to open Jira
>> >> >    - possibly add bitfield for bug fixes.
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Thu, Feb 23, 2017 at 10:01 AM, Julien Le Dem <[email protected]>
>> >> > wrote:
>> >> >
>> >> >> https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up
>> >> >>
>> >> >> --
>> >> >> Julien
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Julien

Reply via email to