paul-rogers commented on issue #2298:
URL: https://github.com/apache/drill/issues/2298#issuecomment-903221943


   @luocooong, I may be confused, but as I read the Mongo spec, it does want 
the UTC "Zulu" format: `2019-09-30T20:47:43Z`. I verified that this format is 
tested and does work correctly in the new JSON loader.
   
   When you say "without a timezone", I think you are describing a date/time of 
the form `2019-09-30T20:47:43`. This would be a local time. Drill uses local 
time internally, but Mongo (wisely) seems to use UTC time. Thus, if you are 
reading Mongo data, you should not see a local time (if I understand Mongo 
correctly). By the way, `2019-09-30T20:47:43Z` does have a time zone: the `Z` 
means zero offset from UTC, also called GMT.
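
To make the distinction concrete, here is a minimal sketch using only the standard `java.time` classes (not Drill code; the class and method names are made up for illustration). It shows that the "Zulu" string carries a zero offset, while the form without the `Z` only parses as a local date/time:

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeParseException;

// Hypothetical helper class, for illustration only; not part of Drill.
class ZuluVersusLocal {

  // True if the text is an ISO-8601 date/time that carries a zero (UTC) offset.
  static boolean hasZeroOffset(String text) {
    try {
      return OffsetDateTime.parse(text).getOffset().equals(ZoneOffset.UTC);
    } catch (DateTimeParseException e) {
      return false; // no offset in the text, or not ISO-8601 at all
    }
  }

  // True if the text is a plain local date/time with no zone information.
  static boolean isLocalDateTime(String text) {
    try {
      LocalDateTime.parse(text);
      return true;
    } catch (DateTimeParseException e) {
      return false; // a trailing "Z" or offset is rejected here
    }
  }

  public static void main(String[] args) {
    System.out.println(hasZeroOffset("2019-09-30T20:47:43Z"));   // Zulu form
    System.out.println(hasZeroOffset("2019-09-30T20:47:43"));    // local form
    System.out.println(isLocalDateTime("2019-09-30T20:47:43Z")); // Zulu form
    System.out.println(isLocalDateTime("2019-09-30T20:47:43"));  // local form
  }
}
```

So the two forms are not interchangeable: a parser written for one rejects the other.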
   
   Looks like your testing shows that the "Zulu" format is broken in the old 
JSON parser. I don't know why the code might have used the format you 
described. That format does not seem to match the Mongo spec. Perhaps the 
original author was trying to match some other version?
   
   I looked at the code in your call stack. It is pretty convoluted -- another 
reason for the new JSON parser. The JSON parser tries to handle all maps the 
same. Mongo extended types are, syntactically, a JSON map. The 
`MapVectorOutput.run()` method checks for Mongo keywords. For the date/time 
keyword, the code then calls `VectorOutput$MapVectorOutput.writeTimestamp`. I 
suspect this is where things went wrong. The 
`VectorOutput$MapVectorOutput.writeTimestamp` method is not unique to Mongo 
JSON; it is a generic vector method. As you note, at present it uses the 
`isoFormatTime` constant in `DateUtility`:
   
   ```java
     public static final DateTimeFormatter isoFormatTime = buildFormatter("HH:mm:ss.SSSXX");
   ```
   
   For Mongo, it should use the `UTC_FORMATTER` constant:
   
   ```java
     public static final DateTimeFormatter UTC_FORMATTER = buildFormatter("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
   ```
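
As a rough illustration of why the choice of pattern matters, the sketch below feeds a Mongo-style timestamp to both pattern strings. It is an assumption-laden stand-in: it builds the formatters with plain `DateTimeFormatter.ofPattern`, not Drill's `buildFormatter` (which may configure things differently), and the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical demo class; the pattern strings are copied from DateUtility,
// but ofPattern is only a stand-in for Drill's buildFormatter.
class MongoDateFormats {
  static final DateTimeFormatter TIME_ONLY =
      DateTimeFormatter.ofPattern("HH:mm:ss.SSSXX");
  static final DateTimeFormatter UTC =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

  // True if the formatter accepts the whole text.
  static boolean parses(DateTimeFormatter formatter, String text) {
    try {
      formatter.parse(text);
      return true;
    } catch (DateTimeParseException e) {
      return false;
    }
  }

  // The 'Z' in the pattern is a quoted literal, so the result has no zone;
  // the caller has to treat the returned value as UTC.
  static LocalDateTime parseUtc(String text) {
    return LocalDateTime.parse(text, UTC);
  }

  public static void main(String[] args) {
    String mongoDate = "2019-09-30T20:47:43.000Z";
    System.out.println(parses(TIME_ONLY, mongoDate)); // time-only pattern
    System.out.println(parses(UTC, mongoDate));       // full UTC pattern
    System.out.println(parseUtc(mongoDate));
  }
}
```

With these stand-in formatters, the time-only pattern rejects the full timestamp, while the UTC pattern accepts it. Note that, built this way, the UTC pattern expects a three-digit fraction, so a value without milliseconds would need separate handling unless `buildFormatter` already relaxes that.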
   
   Checking the file history, it looks like the following commit broke things: 
"DRILL-6242 Use java.time.Local{Date|Time|DateTime} for Drill Date, Time, 
Timestamp types." (Use "Blame" on the `VectorOutput.java` file.) My guess is 
that the author wanted to make sure Drill used only local times, and did not 
realize that this would break Mongo, which requires ISO "Zulu" timestamps.
   
   A quick check of the code suggests that only the old `JsonReader` uses this 
code path. So, you can try reverting the code to use the `UTC_FORMATTER` and 
rerun unit tests. Also check against your Mongo test case. If both of these 
work, then this is the simplest fix.
   
   Now, it could be that something in the tests uses a Mongo-format date/time, 
but with a Drill-like local time. If so, then we can look at the problem and 
think about how to solve it. Let's see if running the tests tells us whether we 
even have this problem.

