Paul Rogers created DRILL-5974:
----------------------------------

             Summary: Read JSON non-relational fields using text mode
                 Key: DRILL-5974
                 URL: https://issues.apache.org/jira/browse/DRILL-5974
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.13.0
            Reporter: Paul Rogers
            Assignee: Paul Rogers
             Fix For: 1.13.0


Proposed is a minor enhancement to the JSON reader to better handle 
non-relational JSON structures.

As background, Drill handles simple tuples:

{code}
{a: 10, b: "fred"}
{code}

Drill also handles arrays:

{code}
{name: "fred", hobbies: ["bowling", "golf"]}
{code}

Drill even handles arrays of tuples:

{code}
{name: "fred", orders: [
  {id: 1001, amount: 12.34},
  {id: 1002, amount: 56.78}]}
{code}

The above are termed "relational" because there is a straightforward mapping 
between tables and these JSON structures.

Things get interesting with non-relational types, such as 2-D arrays:

{code}
{id: 4, shape: "square", points: [[0, 0], [0, 5], [5, 0], [5, 5]]}
{code}

Drill currently offers two ways to read such data:

* Turn on the experimental list and union support.
* Enable all-text mode to read all fields as JSON text.

Proposed is a middle ground:

* Read fields with relational types into vectors.
* Read non-relational fields using text mode.
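The split between the two cases can be sketched as follows. This is a minimal illustration in Python, not Drill's implementation: the names {{is_relational}} and {{read_row}} are hypothetical, and the rule that any array containing another array counts as non-relational is an assumption drawn from the 2-D array example above.

```python
import json

def is_relational(value):
    """Assumed rule: scalars, tuples of relational values, and arrays of
    scalars or tuples are relational; arrays that contain arrays are not."""
    if isinstance(value, dict):
        return all(is_relational(v) for v in value.values())
    if isinstance(value, list):
        # An array element may be a scalar or a tuple, but not another array.
        return all(not isinstance(e, list) and is_relational(e) for e in value)
    return True  # number, string, boolean, or null

def read_row(row):
    """Keep relational fields as parsed values; fall back to JSON text
    for fields that are not relational."""
    return {
        name: value if is_relational(value) else json.dumps(value)
        for name, value in row.items()
    }

row = {"id": 4, "shape": "square",
       "points": [[0, 0], [0, 5], [5, 0], [5, 5]]}
print(read_row(row))
```

Here {{id}} and {{shape}} keep their parsed values, while {{points}} comes back as a single string of JSON text.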

Thus, the first three examples would all result in the JSON data being parsed 
into Drill vectors. The fourth, non-relational example, however, would produce 
a row that looks like this:

{noformat}
id, shape, points
4, "square", "[[0, 0], [0, 5], [5, 0], [5, 5]]"
{noformat}

Although Drill can’t parse the 2-D array into vectors, Drill will pass the 
array along to the client as JSON text. The client can then use its favorite 
JSON parser to parse the array and do something useful with it (such as draw 
the square in this case).
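On the client side, handling such a field might look like the sketch below. It assumes the client receives the {{points}} column as a plain string; the helper {{bounding_box}} is hypothetical and stands in for whatever the application actually does with the shape.

```python
import json

# The "points" column as the client would receive it: a string of JSON text.
points_text = "[[0, 0], [0, 5], [5, 0], [5, 5]]"

def bounding_box(text):
    """Parse the JSON text and return (min_x, min_y, max_x, max_y)."""
    points = json.loads(text)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

print(bounding_box(points_text))  # prints (0, 0, 5, 5)
```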

Specifically, the proposal is to:

* Apply this change only to the revised “batch size aware” JSON reader.
* Use the above parsing model by default.
* Use the experimental list-and-union support if the existing 
{{exec.enable_union_type}} system/session option is set.
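The selection rule in the last two points can be sketched as below. Only the option name {{exec.enable_union_type}} comes from this proposal; the function name, the returned labels, and the use of a plain dictionary for options are illustrative assumptions.

```python
def nonrelational_mode(options):
    """Return how the revised JSON reader would treat non-relational fields.

    `options` is a hypothetical dict of system/session option values.
    """
    if options.get("exec.enable_union_type", False):
        return "list-and-union"  # experimental list and union support
    return "text"                # default: read the field as JSON text

print(nonrelational_mode({}))
print(nonrelational_mode({"exec.enable_union_type": True}))
```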



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
