paul-rogers commented on issue #2298:
URL: https://github.com/apache/drill/issues/2298#issuecomment-903221943
@luocooong, I may be confused, but as I read the Mongo spec, it does want
the UTC "Zulu" format: `2019-09-30T20:47:43Z`. I verified that this format is
tested and does work correctly in the new JSON loader.
When you say "without a timezone", I think you are describing a date/time of
the form `2019-09-30T20:47:43`. This would be a local time. Drill uses local
time internally, but Mongo (wisely) seems to use UTC time. Thus, if you are
reading Mongo data, you should not see a local time (if I understand Mongo
correctly). By the way, `2019-09-30T20:47:43Z` does have a time zone: the `Z`
means zero offset from UTC, also called GMT.
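To make the distinction concrete, here is a minimal sketch using only the JDK's `java.time` classes. It is not Drill code: the class and method names (`ZuluVersusLocal`, `parseZulu`, `parseLocal`, `isInstant`) are made up for illustration.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeParseException;

// Hypothetical demo class, not part of Drill.
class ZuluVersusLocal {

    // Parses an ISO-8601 instant; the trailing Z marks a zero UTC offset.
    static Instant parseZulu(String text) {
        return Instant.parse(text);
    }

    // Parses a value with no zone or offset at all, i.e. a local date/time.
    static LocalDateTime parseLocal(String text) {
        return LocalDateTime.parse(text);
    }

    // Returns true only if the text identifies a point on the UTC timeline.
    static boolean isInstant(String text) {
        try {
            Instant.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Instant zulu = parseZulu("2019-09-30T20:47:43Z");
        LocalDateTime local = parseLocal("2019-09-30T20:47:43");

        System.out.println(zulu);   // 2019-09-30T20:47:43Z
        System.out.println(local);  // 2019-09-30T20:47:43

        // The zone-less form is not an instant until an offset is chosen.
        System.out.println(isInstant("2019-09-30T20:47:43"));  // false

        // Interpreting the local value at UTC yields the same instant.
        System.out.println(local.toInstant(ZoneOffset.UTC).equals(zulu));  // true
    }
}
```

The zone-less string only becomes a point in time once someone decides which offset applies, which is why the two forms should not be treated as interchangeable.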
Looks like your testing shows that the "Zulu" format is broken in the old
JSON parser. I don't know why the code might have used the format you
described. That format does not seem to match the Mongo spec. Perhaps the
original author was trying to match some other version?
I looked at the code in your call stack. It is pretty convoluted -- another
reason for the new JSON parser. The JSON parser tries to handle all maps the
same. Mongo extended types are, syntactically, a JSON map. The
`MapVectorOutput.run()` method checks for Mongo keywords. For the date/time
keyword, the code then calls `VectorOutput$MapVectorOutput.writeTimestamp`. I
suspect this is where things went wrong. The
`VectorOutput$MapVectorOutput.writeTimestamp` method is not unique to Mongo
JSON; it is a generic vector method. As you note, at present it uses the
`isoFormatTime` constant in `DateUtility`:
```java
public static final DateTimeFormatter isoFormatTime =
    buildFormatter("HH:mm:ss.SSSXX");
```
For Mongo, it should use the `UTC_FORMATTER` constant:
```java
public static final DateTimeFormatter UTC_FORMATTER =
    buildFormatter("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
```
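A rough way to see why the two patterns are not interchangeable is to try both against a full timestamp. The sketch below is not Drill code: it uses `DateTimeFormatter.ofPattern` as a stand-in for Drill's `buildFormatter` (whose exact behavior I am not reproducing here), and the names `FormatterPatterns`, `TIME_ONLY`, `UTC` and `parses` are made up.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical demo class, not part of Drill.
class FormatterPatterns {

    // Assumption: plain ofPattern stands in for Drill's buildFormatter.
    static final DateTimeFormatter TIME_ONLY =
        DateTimeFormatter.ofPattern("HH:mm:ss.SSSXX");
    static final DateTimeFormatter UTC =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

    // Returns true if the formatter accepts the whole text.
    static boolean parses(DateTimeFormatter formatter, String text) {
        try {
            formatter.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String timestamp = "2019-09-30T20:47:43.000Z";

        // The time-only pattern has no date fields, so it rejects a timestamp.
        System.out.println(parses(TIME_ONLY, timestamp));         // false
        System.out.println(parses(TIME_ONLY, "20:47:43.000Z"));   // true

        // The UTC pattern accepts the full timestamp with milliseconds.
        System.out.println(parses(UTC, timestamp));                // true
        System.out.println(LocalDateTime.parse(timestamp, UTC));  // 2019-09-30T20:47:43

        // With this stand-in, the literal pattern insists on the .SSS field.
        System.out.println(parses(UTC, "2019-09-30T20:47:43Z"));  // false
    }
}
```

Note the last line: with a plain `ofPattern` formatter, the milliseconds are mandatory. Whether the same holds inside Drill depends on what `buildFormatter` does, so treat that as something to verify rather than a statement about Drill.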
Checking the file history, it looks like the following commit broke things:
"DRILL-6242 Use java.time.Local{Date|Time|DateTime} for Drill Date, Time,
Timestamp types." (Use "Blame" on the `VectorOutput.java` file.) My guess is
that the author wanted to make sure Drill used only local times, and did not
realize that he was breaking Mongo, which requires ISO "Zulu" timestamps.
A quick check of the code suggests that only the old `JsonReader` uses this
code path. So, you can try reverting the code to use the `UTC_FORMATTER` and
rerun unit tests. Also check against your Mongo test case. If both of these
work, then this is the simplest fix.
Now, it could be that something in the tests uses a Mongo-format date/time,
but with a Drill-like local time. If so, then we can look at the problem and
think about how to solve it. Let's see if running the tests tells us if we even
have this problem.