[
https://issues.apache.org/jira/browse/DRILL-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17482554#comment-17482554
]
James Turton commented on DRILL-8100:
-------------------------------------
[~Paul.Rogers] I've reproduced the problem now. It shows up when
store.json.extended_types = true and timestamps get written out in ISO 8601
format with a trailing Z. I wanted to check whether, under
store.json.extended_types = false, this PR changes the answers we give when
reading zoneless timestamps written out by earlier versions of Drill. While
doing that I turned up this round trip failure in a single session, using only
the PR version of Drill.
{code}
1 row selected (0.12 seconds)
apache drill> create table dfs.tmp.foo3 as select now();
Fragment 0_0
Number of records written 1
1 row selected (0.328 seconds)
apache drill> select * from dfs.tmp.foo3;
EXPR$0 *2022-01-26 14:21:17.996* -- wrong, offset by my time zone
1 row selected (0.212 seconds)
{code}
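For anyone wanting to see the mechanism outside of Drill, here is a minimal sketch in plain java.time. It is not Drill's reader or writer code. The class name, the helper names, the sample value and the Africa/Johannesburg zone are all assumptions picked for illustration, and the zoneless format is assumed to look like the query output above. It contrasts a local timestamp written with a "Z" appended and read back as UTC against a zoneless timestamp read back as-is.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

final class ZSuffixReadBack {
    // Hypothetical session time zone (UTC+2, no DST), picked only for illustration.
    static final ZoneId ZONE = ZoneId.of("Africa/Johannesburg");
    static final DateTimeFormatter ZONELESS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
    static final DateTimeFormatter ISO_LOCAL =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS");

    // Writes the local wall-clock fields and labels them as UTC without converting.
    static String writeWithZ(LocalDateTime local) {
        return local.format(ISO_LOCAL) + "Z";
    }

    // Reads a Z-suffixed string as a UTC instant, then converts it to local time.
    static LocalDateTime readWithZ(String text, ZoneId zone) {
        return LocalDateTime.ofInstant(Instant.parse(text), zone);
    }

    // Writes and reads a zoneless string: no zone is claimed, so nothing shifts.
    static String writeZoneless(LocalDateTime local) {
        return local.format(ZONELESS);
    }

    static LocalDateTime readZoneless(String text) {
        return LocalDateTime.parse(text, ZONELESS);
    }

    public static void main(String[] args) {
        LocalDateTime local = LocalDateTime.of(2022, 1, 26, 12, 21, 17, 996_000_000);
        System.out.println("original:        " + local);
        System.out.println("Z round trip:    " + readWithZ(writeWithZ(local), ZONE));
        System.out.println("zoneless trip:   " + readZoneless(writeZoneless(local)));
    }
}
```

In this sketch the Z round trip comes back shifted by the zone's UTC offset, while the zoneless round trip returns the original value.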
> JSON record writer does not convert Drill local timestamp to UTC
> ----------------------------------------------------------------
>
> Key: DRILL-8100
> URL: https://issues.apache.org/jira/browse/DRILL-8100
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.19.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Major
>
> Drill follows the old SQL engine convention of storing the `TIMESTAMP` type in
> the local time zone. This is, of course, highly awkward in today's age when
> UTC is used as the standard timestamp in most products. However, it is how
> Drill works. (It would be great to add a `UTC_TIMESTAMP` type, but that is
> another topic.)
> Each reader or writer that works with files that hold UTC timestamps must
> convert to (reader) or from (writer) Drill's local-time timestamp. Otherwise,
> Drill works correctly only when the server time zone is set to UTC.
> The JSON writer does not do the proper conversion, causing tests to fail when
> run in a time zone other than UTC.
> {noformat}
> @Override
> public void writeTimestamp(FieldReader reader) throws IOException {
>   if (reader.isSet()) {
>     writeTimestamp(reader.readLocalDateTime());
>   } else {
>     writeTimeNull();
>   }
> }
> {noformat}
> Basically, it takes a {{LocalDateTime}} and formats it as a UTC timestamp
> (using the "Z" suffix). This is only valid if the machine is in the UTC time
> zone, which is why the test for this class attempts to force the local time
> zone to UTC, something that most users will not do.
> A consequence of this bug is that "round trip" CTAS will change dates by the
> UTC offset of the machine running the CTAS. In the Pacific time zone, each
> "round trip" subtracts 8 hours from the time. After three round trips, the
> "UTC" date in the Parquet file or JSON will be a day earlier than the
> original data. One might argue that this "feature" is not always helpful.
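The conversion and the round-trip drift described in the issue can be sketched in plain java.time. This is an illustration only, not the actual Drill patch: the class and method names are hypothetical, the zone is fixed to America/Los_Angeles, and the sample date is chosen in January so that the offset is the 8 hours mentioned above.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class TimestampRoundTrip {
    // Hypothetical server time zone; UTC-8 for the January date used below.
    static final ZoneId PACIFIC = ZoneId.of("America/Los_Angeles");
    static final DateTimeFormatter ISO_UTC =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

    // Faulty write: formats the local wall-clock fields and labels them UTC.
    static String writeUnconverted(LocalDateTime local) {
        return local.format(ISO_UTC);
    }

    // Converted write: shifts the local time to UTC before formatting.
    static String writeConverted(LocalDateTime local, ZoneId zone) {
        LocalDateTime utc = local.atZone(zone)
            .withZoneSameInstant(ZoneOffset.UTC)
            .toLocalDateTime();
        return utc.format(ISO_UTC);
    }

    // Read: interprets the text as UTC and converts it to local time.
    static LocalDateTime read(String text, ZoneId zone) {
        return LocalDateTime.parse(text, ISO_UTC)
            .atZone(ZoneOffset.UTC)
            .withZoneSameInstant(zone)
            .toLocalDateTime();
    }

    // Repeats the faulty write followed by a read the given number of times.
    static LocalDateTime faultyRoundTrips(LocalDateTime start, ZoneId zone, int trips) {
        LocalDateTime value = start;
        for (int i = 0; i < trips; i++) {
            value = read(writeUnconverted(value), zone);
        }
        return value;
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2022, 1, 26, 12, 0);
        System.out.println("after 3 faulty trips: " + faultyRoundTrips(start, PACIFIC, 3));
        System.out.println("after converted trip: "
            + read(writeConverted(start, PACIFIC), PACIFIC));
    }
}
```

Under these assumptions each faulty trip loses 8 hours, so three trips land exactly one day earlier, while the converted write followed by a read returns the starting value.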