paul-rogers opened a new pull request #1913: DRILL-6953: EVF-based version of the JSON reader
URL: https://github.com/apache/drill/pull/1913

Reimplements the JSON reader on top of the EVF. Does not yet handle a provided schema. The new JSON parser does not yet reflect any changes made to the "V1" JSON parser in the last year. Does not yet handle the Union and List-of-union types; enabling those exposed many issues elsewhere in Drill.

Provides more robust (but still limited) handling of JSON type ambiguities:

- Handles runs of nulls before the first non-null value (within the first batch).
- Handles runs of empty arrays before the first non-empty array (again, within the first batch).
- Handles the case where a null value turns out to be an object or array.
- Handles reasonable conversions between types.

Handling ambiguities makes the new parser more complex than the "V1" version. The new one uses explicit states for each kind of JSON object, whereas the old one used implicit states expressed via if-statements, which become hard to follow as the states get more complex.

The new "V2" JSON scan is controlled by a new option, store.json.enable_v2_reader, which is false by default in this PR.

Adds a "projection type" to the column writer so that the JSON parser can receive a "hint" as to the expected type. The hint comes from the form of the projected column: `a[0]`, `a.b` or just `a`.

Reimplements a number of JSON tests to test both the original "V1" and the new "V2" versions of the JSON reader. Adds many new tests for the new features of the "V2" parser.
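
As a rough illustration of the projection hint described above, the sketch below classifies a projected column by its form. It is not code from this PR: the class `ProjectionHintSketch`, the enum `Hint` and the method `classify` are hypothetical names chosen for the example, and the string matching is a simplification (for instance, it ignores back-tick-quoted names that contain dots).

```java
// Illustrative sketch only: every name here is hypothetical, not Drill's API.
final class ProjectionHintSketch {

  /** What the form of a projected column suggests about its JSON type. */
  enum Hint { ARRAY, OBJECT, UNSPECIFIED }

  /**
   * Classifies a projected column reference by the first qualifier after the
   * column name: "a[0]" suggests an array, "a.b" suggests an object, and a
   * plain "a" gives no hint.
   */
  static Hint classify(String projectedColumn) {
    int bracket = projectedColumn.indexOf('[');
    int dot = projectedColumn.indexOf('.');
    if (bracket < 0 && dot < 0) {
      return Hint.UNSPECIFIED;
    }
    // Whichever qualifier appears first describes the top-level column.
    if (dot < 0 || (bracket >= 0 && bracket < dot)) {
      return Hint.ARRAY;
    }
    return Hint.OBJECT;
  }

  public static void main(String[] args) {
    for (String col : new String[] {"a", "a.b", "a[0]", "a[0].b", "a.b[0]"}) {
      System.out.println(col + " -> " + classify(col));
    }
  }
}
```

With a hint like this, a parser that sees only nulls for column `a` could, under these assumptions, still pick an array or object writer up front when the query projects `a[0]` or `a.b`, instead of guessing a scalar type.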
