[ 
https://issues.apache.org/jira/browse/DRILL-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17482554#comment-17482554
 ] 

James Turton commented on DRILL-8100:
-------------------------------------

[~Paul.Rogers] I've reproduced the problem now.  It shows up when 
store.json.extended_types = true and timestamps get written out in ISO 8601 
format with a trailing Z.  I wanted to check whether, under 
store.json.extended_types = false, this PR changes the answers we give when 
reading zoneless timestamps written out by earlier versions of Drill.  While 
doing so I turned up this round trip failure in a single session using only 
the PR version of Drill.



 
{code}
1 row selected (0.12 seconds)
apache drill> create table dfs.tmp.foo3 as select now();
Fragment                   0_0
Number of records written  1
1 row selected (0.328 seconds)
apache drill> select * from dfs.tmp.foo3;
EXPR$0  *2022-01-26 14:21:17.996* --  wrong, offset by my time zone
1 row selected (0.212 seconds)
{code}
 

> JSON record writer does not convert Drill local timestamp to UTC
> ----------------------------------------------------------------
>
>                 Key: DRILL-8100
>                 URL: https://issues.apache.org/jira/browse/DRILL-8100
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.19.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>
> Drill follows the old SQL engine convention to store the `TIMESTAMP` type in 
> the local time zone. This is, of course, highly awkward in today's age when 
> UTC is used as the standard timestamp in most products. However, it is how 
> Drill works. (It would be great to add a `UTC_TIMESTAMP` type, but that is 
> another topic.)
> Each reader or writer that works with files that hold UTC timestamps must 
> convert to (reader) or from (writer) Drill's local-time timestamp. Otherwise, 
> Drill works correctly only when the server time zone is set to UTC.
> The JSON writer does not do the proper conversion, causing tests to fail when 
> run in a time zone other than UTC.
> {noformat}
>   @Override
>   public void writeTimestamp(FieldReader reader) throws IOException {
>     if (reader.isSet()) {
>       writeTimestamp(reader.readLocalDateTime());
>     } else {
>       writeTimeNull();
>     }
>   }
> {noformat}
> Basically, it takes a {{LocalDateTime}} and formats it as a UTC timestamp 
> (using the "Z" suffix). This is only valid if the machine is in the UTC time 
> zone, which is why the test for this class attempts to force the local time 
> zone to UTC, something that most users will not do.
> A consequence of this bug is that "round trip" CTAS will change dates by the 
> UTC offset of the machine running the CTAS. In the Pacific time zone, each 
> "round trip" subtracts 8 hours from the time. After three round trips, the 
> "UTC" date in the Parquet file or JSON will be a day earlier than the 
> original data. One might argue that this "feature" is not always helpful.
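
The drift described above can be reproduced outside Drill with plain java.time. This is only a sketch under stated assumptions, not Drill's actual writer or the fix in the PR: the class TimestampRoundTrip and the methods writeBuggy, writeFixed and read are hypothetical names, and a fixed UTC-8 offset stands in for the Pacific time zone so that daylight saving does not affect the arithmetic.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-in for a JSON timestamp writer/reader pair; not Drill code.
final class TimestampRoundTrip {

    // Mimics the reported bug: the local wall-clock value is emitted
    // unchanged and merely labelled as UTC with a trailing "Z".
    static String writeBuggy(LocalDateTime local) {
        return local.format(DateTimeFormatter.ISO_LOCAL_DATE_TIME) + "Z";
    }

    // One possible correction: interpret the value in the server zone,
    // convert to a UTC instant, then format.
    static String writeFixed(LocalDateTime local, ZoneId serverZone) {
        return local.atZone(serverZone).toInstant().toString();
    }

    // Reader side: parse the UTC text and convert back to local time.
    static LocalDateTime read(String utcText, ZoneId serverZone) {
        return LocalDateTime.ofInstant(Instant.parse(utcText), serverZone);
    }

    public static void main(String[] args) {
        ZoneId pacific = ZoneOffset.ofHours(-8); // assumption: fixed offset, no DST
        LocalDateTime original = LocalDateTime.of(2022, 1, 26, 14, 21, 17, 996_000_000);

        // Each buggy write/read cycle shifts the value by the UTC offset.
        LocalDateTime value = original;
        for (int trip = 1; trip <= 3; trip++) {
            value = read(writeBuggy(value), pacific);
            System.out.println("buggy round trip " + trip + ": " + value);
        }

        // With the conversion in place the value survives the round trip.
        LocalDateTime stable = read(writeFixed(original, pacific), pacific);
        System.out.println("fixed round trip:   " + stable);
    }
}
```

With an 8 hour offset, three buggy round trips move the value back by 24 hours, which matches the "a day earlier" observation in the description, while the converted write reads back as the original value.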



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
