[ 
https://issues.apache.org/jira/browse/DRILL-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293477#comment-16293477
 ] 

Paul Rogers edited comment on DRILL-6035 at 12/16/17 1:07 AM:
--------------------------------------------------------------

h4. JSON Structure

The [JSON standard|https://tools.ietf.org/html/rfc7159], as [described more 
clearly here|https://www.json.org], states that a JSON document is a single 
value (null, scalar, object or array). Drill supports a non-standard (but common) 
extension that allows a list of objects that is not wrapped in an array.

|| Document Structure || JSON Standard || Drill Support ||
| Empty | Invalid | Empty list of records |
| null | Valid | Invalid |
| Scalar | Valid | Invalid |
| Array | Valid | Valid (in Drill 1.13) as long as the value is an array of objects |
| Object | Valid | Single record |
| List of objects | Invalid | List of records |

In Drill, there must be no commas between top-level objects. This differs 
from the JSON standard, which requires commas to separate the items in an 
array or the members of an object. (The difference exists because Drill's JSON 
file structure is not itself a JSON document. Think of it instead as a 
serialized sequence of JSON objects.)

h4. Drill JSON Document Structure

Thus, a typical JSON input file in Drill is:

{code}
{"a": 10, "b": "foo"}
{"a": 20, "b": "bar"}
{code}
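
To make the "serialized sequence of objects" idea concrete, here is a minimal Python sketch that reads such a file. It is not how Drill itself parses JSON; the function name {{iter_records}} is made up for this illustration, and keys are quoted because Python's {{json}} module is strict about that.

```python
import json


def iter_records(text):
    """Yield each top-level JSON object from text that holds objects
    back to back, separated only by whitespace (hypothetical helper)."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos == end:
            return
        # Decode exactly one value and get the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        if not isinstance(value, dict):
            raise ValueError("top-level value is not an object")
        yield value


sample = '{"a": 10, "b": "foo"}\n{"a": 20, "b": "bar"}\n'
records = list(iter_records(sample))
print(records)
```

A comma between the two objects makes this sketch fail with a decode error, which matches the rule that top-level objects must not be separated by commas.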

h4. Top-Level Array

As noted above, for JSON compatibility, Drill also supports a top-level array 
of objects:

{code}
[
 {"a": 10, "b": "foo"},
 {"a": 20, "b": "bar"}
]
{code}

Note that, when the objects are in an array, a comma must separate objects.
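
The rules in the table can be sketched as a small Python function that accepts either form and rejects the rest. This is only an illustration of the described behavior, not Drill's actual reader; {{read_records}} is a hypothetical name, and treating an empty top-level array as zero records is an assumption of the sketch.

```python
import json


def read_records(text):
    """Return the list of records in text, following the table:
    empty input, a single object, a sequence of objects, or one
    top-level array of objects are accepted; anything else is rejected."""
    decoder = json.JSONDecoder()
    values = []
    pos, end = 0, len(text)
    while True:
        # Skip whitespace between top-level values.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos == end:
            break
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)

    # A single top-level array is unwrapped into its elements.
    if len(values) == 1 and isinstance(values[0], list):
        values = values[0]

    # Every record must be an object; null and scalars are invalid.
    for value in values:
        if not isinstance(value, dict):
            raise ValueError("expected an object, got %r" % (value,))
    return values


print(read_records('[{"a": 10, "b": "foo"}, {"a": 20, "b": "bar"}]'))
print(read_records('{"a": 10, "b": "foo"}\n{"a": 20, "b": "bar"}'))
```

Both calls return the same two records; inputs such as {{null}}, {{42}} or {{[1, 2]}} raise {{ValueError}}.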

The above applies to a JSON text file. No separator is implied (or needed) if 
the data comes from a document database, a Kafka stream or another non-file 
source.


> Specify Drill's JSON behavior
> -----------------------------
>
>                 Key: DRILL-6035
>                 URL: https://issues.apache.org/jira/browse/DRILL-6035
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.13.0
>            Reporter: Paul Rogers
>            Assignee: Pritesh Maker
>
> Drill supports JSON as its native data format. However, experience suggests 
> that Drill has limitations in the JSON that it supports. This ticket 
> asks to clarify Drill's expected behavior on various kinds of JSON.
> Topics to be addressed:
> * Relational vs. non-relational structures
> * JSON structures used in practice and how they map to Drill
> * Support for varying data types
> * Support for missing values, especially across files
> These topics are complex, hence the request to provide a detailed 
> specification that clarifies what Drill does and does not support (or what 
> it should and should not support).


