[
https://issues.apache.org/jira/browse/DRILL-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404087#comment-17404087
]
ASF GitHub Bot commented on DRILL-7989:
---------------------------------------
paul-rogers commented on pull request #2299:
URL: https://github.com/apache/drill/pull/2299#issuecomment-905020455
Hi @luocooong, thanks for the explanation. You probably know more about
Mongo than I do: my knowledge comes from the Mongo specs. The code you
highlighted suggests that Mongo is sending BSON (?), which the Mongo client
converts to JSON (?).
So, we have a number of unknowns:
* What is the format of data sent from Mongo to Drill? JSON? BSON? Something
else?
* Which code creates the JSON that we want Drill to parse? What is the
format of that JSON?
* Which of the Mongo-specified date formats does that JSON include?
* How do we ensure the "old" reader correctly reads the JSON which the Mongo
plugin gives it?
I don't really understand the first three items: my knowledge is limited
to the Mongo specs, the "new" JSON reader and, to some degree, the
"old" JSON reader. Please feel free to explain how you actually see Mongo
working to fill in the unknowns above.
The key point in my previous comments on this PR is that Mongo appears to
provide dates in UTC: at least, that is what the specs say. Since Drill stores
dates in local time, conversion from UTC-to-local is necessary when converting
Mongo JSON to Drill value vectors. I've assumed that the JSON we want to parse
follows the Mongo specs (and thus is in UTC.) If the JSON is something else
(dates are already in local time?), then we have to rethink the approach.
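To make the UTC-to-local step concrete, here is a minimal sketch, assuming the JSON carries a UTC date string and using plain `java.time` rather than whatever classes Drill's readers actually use. The class name `UtcToLocalSketch`, the helper `utcToLocal`, the sample date, and the zone are all hypothetical; only the pattern string comes from the issue description.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class UtcToLocalSketch {
  // Pattern quoted in the issue description; "XX" accepts "Z" or an offset such as "+0800".
  static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSXX");

  // Hypothetical helper: parse a date string with its offset, then shift it
  // to the wall-clock time of the given zone (the "local time" in this sketch).
  static LocalDateTime utcToLocal(String text, ZoneId zone) {
    OffsetDateTime parsed = OffsetDateTime.parse(text, FORMAT);
    return parsed.atZoneSameInstant(zone).toLocalDateTime();
  }

  public static void main(String[] args) {
    // Assumed sample value and zone, chosen only for illustration.
    ZoneId zone = ZoneId.of("Asia/Shanghai");
    LocalDateTime local = utcToLocal("2021-08-24T12:00:00.000Z", zone);
    System.out.println(local); // prints 2021-08-24T20:00
  }
}
```

If the incoming dates turn out to be in local time already, this shift would be applied twice, which is why the unknowns above matter.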
BTW: that Drill uses local time for timestamps is something that has long
been debated. It is far too late to change the existing types. Folks have
proposed adding a `UTC_TIMESTAMP` type, though the work has never been
completed.
I think you'll need to poke around a bit and explain what you see happening
to fill in the unknowns listed above.
> Use the UTC formatter in the JSON reader
> ----------------------------------------
>
> Key: DRILL-7989
> URL: https://issues.apache.org/jira/browse/DRILL-7989
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Cong Luo
> Assignee: Cong Luo
> Priority: Major
> Fix For: 1.20.0
>
>
> MongoDB uses the UTC format to specify date values by default, but the JSON
> reader (old version) uses the fixed date formatter
> "yyyy-MM-dd'T'HH:mm:ss.SSSXX". It needs to change to the
> "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'" format.