[ 
https://issues.apache.org/jira/browse/DRILL-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929121#comment-15929121
 ] 

Paul Rogers commented on DRILL-5360:
------------------------------------

More experiments. Turns out Drill *tries* to follow the JDBC standard and 
return a {{Timestamp}} as UTC. Here is how a {{NOW()}} query seems to work:

* Drill queries the system to get current time, UTC.
* Drill converts UTC to local time and stores it in the {{Timestamp}} vector.
* Drill sends the server-local time to the JDBC client.
* The JDBC client uses the client's local time zone to convert the 
{{Timestamp}} to UTC.
* JDBC converts the (quasi) UTC timestamp to a {{Timestamp}} object.

This works wonderfully if the client and server are in the same time zone. But, 
if they are in different time zones, the answer will be wrong. Consider a 
server in EST, a client in PST. Consider a time in UTC of 2017-03-10T01:00:00.

* Drill retrieves 2017-03-10T01:00:00 UTC from the system.
* Drill coverts the time to EST (UTC-5) or 2017-03-09T20:00:00 EST.
* Drill sends the EST time to the PST client.
* The PST client converts the time to (quasi) UTC using the PST offset: UTC-8: 
2017-03-10T04:00:00 UTC.

Notice that the "UTC" time on the client (04:00:00) is *not* the same as the 
true UTC on the server: 01:00:00.

This is a problem any time the client and server are in different time zones.

Lesson: Drill only supports Timestamp when the client and server are in the 
same time zone (or configured to use the same time zones.)

Also, the documentation is very misleading.

> Timestamp type documented as UTC, implemented as local time
> -----------------------------------------------------------
>
>                 Key: DRILL-5360
>                 URL: https://issues.apache.org/jira/browse/DRILL-5360
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>
> The Drill documentation implies that the {{Timestamp}} type is in UTC:
> bq. JDBC timestamp in year, month, date hour, minute, second, and optional 
> milliseconds format: yyyy-MM-dd HH:mm:ss.SSS. ... TIMESTAMP literals: Drill 
> stores values in Coordinated Universal Time (UTC). Drill supports time 
> functions in the range 1971 to 2037. ... Drill does not support TIMESTAMP 
> with time zone.
> The above is ambiguous. The first part talks about JDBC timestamps. From the 
> JDK Javadoc:
> bq. Timestamp: A thin wrapper around java.util.Date. ... Date class is 
> intended to reflect coordinated universal time (UTC)...
> So, a JDBC timestamp is intended to represent time in UTC. (The "indented to 
> reflect" statement leaves open the possibility of misusing {{Date}} to 
> represent times in other time zones. This was common practice in early Java 
> development and was the reason for the eventual development of the Joda, then 
> Java 8 date/time classes.)
> The Drill documentation implies that timestamp *literals* are in UTC, but a 
> careful read of the documentation does allow an interpretation that the 
> internal representation can be other than UTC. If this is true, then we would 
> also rely on a liberal reading of the Java `Timestamp` class to also not be 
> UTC. (Or, we rely on the Drill JDBC driver to convert from the (unknown) 
> server time zone to a UTC value returned by the Drill JDBC client.)
> Still, a superficial reading (and common practice) would suggest that a Drill 
> Timestamp should be in UTC.
> However, a test on a Mac, with an embedded Drillbit (run in the Pacific time 
> zone, with Daylight Savings Time in effect) shows that the Timestamp binary 
> value is actual local time:
> {code}
>       long before = System.currentTimeMillis();
>       long value = getDateValue(client, "SELECT NOW() FROM (VALUES(1))" );
>       double hrsDiff = (value - before) / (1000.00 * 60 * 60);
>       System.out.println("Hours: " + hrsDiff);
> {code}
> The above gets the actual UTC time from Java. Then, it runs a query that gets 
> Drill's idea of the current time using the {{NOW()}} function. (The 
> {{getDateValue}} function uses the new test framework to access the actual 
> {{long}} value from the returned value vector.) Finally, we compute the 
> difference between the two times, converted to hours. Output:
> {code}
> Hours: -6.9999975
> {code}
> As it turns out, this is the difference between UTC and PDT. So, the time is 
> in local time, not UTC.
> Since the documentation and implementation are both ambiguous, it is hard to 
> know the intent of the Drill Timestamp. Clearly, common practice is to use 
> UTC. But, there is wiggle-room.
> If the Timestamp value is supposed to be local time, then Drill should 
> provide a function to return the server's time zone offset (in ms) from UTC 
> so that the client can to the needed local-to-UTC conversion to get a true 
> timestamp.
> On the other hand, if the Timestamp is supposed to be UTC (per common 
> practice), then {{NOW()}} should not report local time, it should return UTC.
> Further, if {{NOW()}} returns local time, but Timestamp literals are UTC, 
> then it is hard to see how any query can be rationally written if one 
> timestamp value is local, but a literal is UTC.
> So, job #1 is to define the Timestamp semantics. Then, use that to figure out 
> where the bug lies to make implementation consistent with documentation (or 
> visa-versa.)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to