paul-rogers edited a comment on issue #2298:
URL: https://github.com/apache/drill/issues/2298#issuecomment-902913520


   Hi @luocooong,
   
   Thanks for looking at this issue.
   
   It looks like you are modifying the old, obsolete JSON parser. While it is 
fine to do so, you may want to use the much-improved JSON parser in 
`org.apache.drill.exec.store.easy.json` instead. Otherwise, we end up trying to 
keep two entirely different code bases in sync. The new version has many bug 
fixes, works with EVF, and supports a provided schema. It also properly 
handles Mongo date/time types and, unlike the older code, has abundant unit 
tests. 
   
   What code uses the old version? Should we consider upgrading that code to 
use the new version?
   
   Here's how things work in the new version.
   
   First, plain JSON has no date-time data type. JSON scalar values are limited 
to string, number, Boolean and null. So, we must be talking about the "extended 
types" for Mongo. We follow two Mongo specs:
   
   * [V1](https://docs.mongodb.com/manual/reference/mongodb-extended-json-v1/#date)
   * [V2](https://docs.mongodb.com/manual/reference/mongodb-extended-json/#bson.Date)
   
   The V2 spec says:
   
   > Relaxed: `{"$date": "<ISO-8601 Date/Time Format>"}` <br/>
   > ...  <br/>
   > "<ISO-8601 Date/Time Format>"  <br/>
   > A date in [ISO-8601 Internet Date/Time 
Format](https://tools.ietf.org/html/rfc3339#section-5.6) as string.
   
   The class `org.apache.drill.exec.store.easy.json.loader.TestExtendedTypes` 
tests this functionality. See `testDate()`. That test uses 
`DateTimeFormatter.ISO_INSTANT` to create a formatted date. The resulting `utc` 
variable contains the string `"2020-04-21T18:22:33Z"`. The test then verifies 
that the following "Extended" JSON parses correctly:
   
   ```json
   { "a": { "$date": "2020-04-21T18:22:33Z" } }
   ```
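
   As a rough illustration of that round trip, here is a minimal sketch that 
uses only `java.time`. The class and method names are hypothetical and are not 
part of Drill or of `TestExtendedTypes`; the sketch only shows how 
`DateTimeFormatter.ISO_INSTANT` yields the string form that the relaxed `$date` 
syntax expects, and how that string maps back to a timestamp.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not part of Drill.
final class ExtendedDateSketch {

  // Format an instant as an ISO-8601 UTC string, e.g. "2020-04-21T18:22:33Z".
  static String toIsoUtc(Instant instant) {
    return DateTimeFormatter.ISO_INSTANT.format(instant);
  }

  // Wrap the ISO string in the relaxed extended-JSON form: { "$date": "..." }.
  static String toExtendedJson(String column, Instant instant) {
    return "{ \"" + column + "\": { \"$date\": \"" + toIsoUtc(instant) + "\" } }";
  }

  // Parse the ISO string back into milliseconds since the epoch.
  static long toEpochMillis(String isoUtc) {
    return Instant.parse(isoUtc).toEpochMilli();
  }

  public static void main(String[] args) {
    Instant instant = Instant.parse("2020-04-21T18:22:33Z");
    System.out.println(toExtendedJson("a", instant));
    System.out.println(toEpochMillis(toIsoUtc(instant)));
  }
}
```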
   
   Since the Mongo extended types are the only place where JSON carries 
date/time values, we should follow the Mongo spec for those types. There is no 
need to try multiple formats, nor to ask the user for the format. The only 
formats we should support are those defined by Mongo.
   
   Second, the new version supports a provided schema. It may be that you are 
working with a JSON file or source that is not Mongo, yet includes dates. 
Simply use the provided schema mechanism to mark the `ts` column (say) as a 
`TIMESTAMP` and optionally provide a format. The "new" JSON parser will 
automagically do the conversion.
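
   To make the idea concrete, the sketch below shows the kind of conversion a 
format-driven `TIMESTAMP` column implies: a string plus a pattern becomes a 
timestamp. This is not Drill's implementation. The class name, the sample 
pattern `yyyy-MM-dd HH:mm:ss`, and the choice to interpret the value as UTC are 
all assumptions made for illustration.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; this is not Drill's conversion code.
final class ProvidedFormatSketch {

  // Parse a date/time string with a caller-supplied pattern and return
  // milliseconds since the epoch. Treating the value as UTC is an assumption.
  static long parseTimestamp(String text, String pattern) {
    DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
    LocalDateTime local = LocalDateTime.parse(text, formatter);
    return local.toInstant(ZoneOffset.UTC).toEpochMilli();
  }

  public static void main(String[] args) {
    // A non-Mongo source might deliver the `ts` column as a plain string.
    String ts = "2020-04-21 18:22:33";
    System.out.println(parseTimestamp(ts, "yyyy-MM-dd HH:mm:ss"));
  }
}
```

   The point is that the format lives in the schema, not in the parser: the 
reader applies whatever pattern the schema declares for the column, so no 
format guessing is needed.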
   
   Where are you using the old JSON parser?

