[ 
https://issues.apache.org/jira/browse/DRILL-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728455#comment-16728455
 ] 

Vitalii Diravka commented on DRILL-5360:
----------------------------------------

[~priteshm]
Drill has a lot of issues and inconsistencies connected to the "time/date" 
datatypes and Time Zones. It is necessary to lead it to the general form.
 Things to be done:
 * Design doc for Drill Date, Time, Timestamp;
 * Revisiting and leading to the SQL standard of Drill "date/time" datatypes. 
Reading, converting, storing internally (ValueVectors), extracting and 
representing of "date/time" datatypes:
 ** Consistency between different data sources (parquet, csv ...)
 ** Date, time should not rely in particular time zone
 ** Issues connected to representing of "date/time" data types for JDBC (server 
and client are in different time zones).
 ** Issues connected to CAST operator
 ** Improvement of "date/time" output (style)
 ** Considering of new data type - TimestampTZ 
([PostgreSQL|https://www.postgresql.org/docs/9.1/static/datatype-datetime.html] 
has equivalent TIMESTAMP WITH TIME ZONE and Calcite has - 
TIMESTAMP_WITH_LOCAL_TIME_ZONE);
 * Moving from [Joda Time|http://www.joda.org/joda-time/] to [Java SE 8 Date 
and 
Time|http://www.oracle.com/technetwork/articles/java/jf14-date-time-2125367.html]

These changes are not backward-compatible. Therefore they should be implemented 
along with design document for the next Drill major version. It seems the best 
one is 2.0.0 version.

> Timestamp type documented as UTC, implemented as local time
> -----------------------------------------------------------
>
>                 Key: DRILL-5360
>                 URL: https://issues.apache.org/jira/browse/DRILL-5360
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Bohdan Kazydub
>            Priority: Critical
>             Fix For: 2.0.0
>
>
> The Drill documentation implies that the {{Timestamp}} type is in UTC:
> bq. JDBC timestamp in year, month, date hour, minute, second, and optional 
> milliseconds format: yyyy-MM-dd HH:mm:ss.SSS. ... TIMESTAMP literals: Drill 
> stores values in Coordinated Universal Time (UTC). Drill supports time 
> functions in the range 1971 to 2037. ... Drill does not support TIMESTAMP 
> with time zone.
> The above is ambiguous. The first part talks about JDBC timestamps. From the 
> JDK Javadoc:
> bq. Timestamp: A thin wrapper around java.util.Date. ... Date class is 
> intended to reflect coordinated universal time (UTC)...
> So, a JDBC timestamp is intended to represent time in UTC. (The "indented to 
> reflect" statement leaves open the possibility of misusing {{Date}} to 
> represent times in other time zones. This was common practice in early Java 
> development and was the reason for the eventual development of the Joda, then 
> Java 8 date/time classes.)
> The Drill documentation implies that timestamp *literals* are in UTC, but a 
> careful read of the documentation does allow an interpretation that the 
> internal representation can be other than UTC. If this is true, then we would 
> also rely on a liberal reading of the Java `Timestamp` class to also not be 
> UTC. (Or, we rely on the Drill JDBC driver to convert from the (unknown) 
> server time zone to a UTC value returned by the Drill JDBC client.)
> Still, a superficial reading (and common practice) would suggest that a Drill 
> Timestamp should be in UTC.
> However, a test on a Mac, with an embedded Drillbit (run in the Pacific time 
> zone, with Daylight Savings Time in effect) shows that the Timestamp binary 
> value is actual local time:
> {code}
>       long before = System.currentTimeMillis();
>       long value = getDateValue(client, "SELECT NOW() FROM (VALUES(1))" );
>       double hrsDiff = (value - before) / (1000.00 * 60 * 60);
>       System.out.println("Hours: " + hrsDiff);
> {code}
> The above gets the actual UTC time from Java. Then, it runs a query that gets 
> Drill's idea of the current time using the {{NOW()}} function. (The 
> {{getDateValue}} function uses the new test framework to access the actual 
> {{long}} value from the returned value vector.) Finally, we compute the 
> difference between the two times, converted to hours. Output:
> {code}
> Hours: -6.9999975
> {code}
> As it turns out, this is the difference between UTC and PDT. So, the time is 
> in local time, not UTC.
> Since the documentation and implementation are both ambiguous, it is hard to 
> know the intent of the Drill Timestamp. Clearly, common practice is to use 
> UTC. But, there is wiggle-room.
> If the Timestamp value is supposed to be local time, then Drill should 
> provide a function to return the server's time zone offset (in ms) from UTC 
> so that the client can to the needed local-to-UTC conversion to get a true 
> timestamp.
> On the other hand, if the Timestamp is supposed to be UTC (per common 
> practice), then {{NOW()}} should not report local time, it should return UTC.
> Further, if {{NOW()}} returns local time, but Timestamp literals are UTC, 
> then it is hard to see how any query can be rationally written if one 
> timestamp value is local, but a literal is UTC.
> So, job #1 is to define the Timestamp semantics. Then, use that to figure out 
> where the bug lies to make implementation consistent with documentation (or 
> visa-versa.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to