[ https://issues.apache.org/jira/browse/DRILL-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293362#comment-16293362 ]
Paul Rogers edited comment on DRILL-6035 at 12/16/17 1:19 AM:
--------------------------------------------------------------

h4. JSON Projection Pushdown

The JSON reader supports "projection push-down." The rules are simple in concept, but complex in the details. The project list comes from the query. In its simplest form, it is the list of columns following the {{SELECT}} keyword:

{code}
SELECT a, b.c, d[0] FROM ...
{code}

|| Projection || JSON Value of "a" || Drill Result ||
| `a` | Scalar | Projects `a` |
| | Array | Projects all elements of `a` |
| | Object | Projects all members of `a` |
| | Missing | Creates a {{Nullable INT}} (Drill 1.12) or {{Nullable VARCHAR}} (Drill 1.13) column |
| | {{null}} | As above |
| `a`.`b` | Scalar | Error (`a` must be an object) |
| | Scalar array | Error (`a` must be a map or an array of maps) |
| | Object that contains `b` | Projects just `b` from object `a` |
| | Object that does not contain `b` | Projects a nullable column `b` within map `a` |
| | Object array that contains `b` | Projects just `b` from the objects within array `a` |
| | Object array that does not contain `b` | Projects a nullable column `b` within the array of maps |
| | Missing | Projects a map `a` that contains a nullable column `b` |
| | {{null}} | As above |
| `a\[0]` | Scalar | Error (`a` must be an array) |
| | Scalar array | Projects just `a\[0]` as a scalar (the reader projects the entire array, a project operator pulls out the `a\[0]` element) |
| | Object | Error (`a` must be an array) |
| | Object array | Projects just object (map) `a\[0]` as described above |
| | {{null}} | JSON creates an array of null values, project pulls out `a\[0]` |
| | Missing | As above |

Notes:

* The rules above are for Drill 1.13. Drill 1.12 and earlier behave differently, and require investigation.
* The rules for null values are subtle. The type of the null is inferred from the project list in the case of a map (`a`.`b`) or an array (`a\[0]`).
Previous sections described null handling for the {{SELECT *}} and {{SELECT `a`}} cases.
* The rules for projecting map columns apply to both arrays and single maps. (In Drill 1.12 and earlier, the two cases appear to have behaved differently.)

> Specify Drill's JSON behavior
> -----------------------------
>
> Key: DRILL-6035
> URL: https://issues.apache.org/jira/browse/DRILL-6035
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Pritesh Maker
>
> Drill supports JSON as its native data format. However, experience suggests
> that Drill may have limitations in the JSON that Drill supports. This ticket
> asks to clarify Drill's expected behavior on various kinds of JSON.
> Topics to be addressed:
> * Relational vs. non-relational structures
> * JSON structures used in practice and how they map to Drill
> * Support for varying data types
> * Support for missing values, especially across files
> These topics are complex, hence the request to provide a detailed
> specification that clarifies what Drill does and does not support (or what
> it should and should not support).

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
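
As an illustration of the projection table in the comment above, the rules can be sketched as a small Python model. This is only a toy restatement of the table, not Drill's reader code: the names {{MISSING}}, {{classify}} and {{outcome}}, and the result labels "project", "null-fill" and "error", are hypothetical and chosen for this sketch. It also assumes that an array counts as an "object array" only when it is non-empty and every element is an object, and that an object array "contains `b`" when at least one element has a member `b`; the comment does not spell out either point.

```python
from typing import Any

# Hypothetical sentinel: column "a" does not appear in the JSON record at all.
MISSING = object()


def classify(value: Any) -> str:
    """Name the JSON shape of a value using the vocabulary of the table."""
    if value is MISSING:
        return "missing"
    if value is None:
        return "null"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Assumption: non-empty and all elements are objects => object array.
        if value and all(isinstance(v, dict) for v in value):
            return "object array"
        return "scalar array"
    return "scalar"


# (projection, shape) pairs that the table lists as errors.
ERRORS = {
    ("a.b", "scalar"),
    ("a.b", "scalar array"),
    ("a[0]", "scalar"),
    ("a[0]", "object"),
}


def outcome(projection: str, record: dict) -> str:
    """Return 'project', 'null-fill' or 'error' for projection 'a', 'a.b' or 'a[0]'."""
    value = record.get("a", MISSING)
    shape = classify(value)
    if (projection, shape) in ERRORS:
        return "error"
    if shape in ("missing", "null"):
        # The table fills in a nullable column, map member or array here.
        return "null-fill"
    if projection == "a.b":
        maps = value if shape == "object array" else [value]
        if not any("b" in m for m in maps):
            # `b` is absent: a nullable column `b` is created inside the map(s).
            return "null-fill"
    return "project"
```

For example, {{outcome("a.b", {"a": [1, 2]})}} reports an error because `a` is a scalar array, while {{outcome("a.b", {"a": {"c": 1}})}} reports a null fill because the object exists but has no member `b`.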