Roman created DRILL-4824:
----------------------------
Summary: JSON with complex nested data produces incorrect output
with missing fields
Key: DRILL-4824
URL: https://issues.apache.org/jira/browse/DRILL-4824
Project: Apache Drill
Issue Type: New Feature
Components: Storage - JSON
Affects Versions: 1.7.0
Reporter: Roman
Assignee: Roman
Fix For: Future
Drill produces incorrect output for a JSON file with complex nested data. Here
is an example JSON file:
{code:none|title=example.json|borderStyle=solid}
{
  "Field1" : {
  }
}
{
  "Field1" : {
    "InnerField1": {"key1":"value1"},
    "InnerField2": {"key2":"value2"}
  }
}
{
  "Field1" : {
    "InnerField3" : ["value3", "value4"],
    "InnerField4" : ["value5", "value6"]
  }
}
{code}
Here is the actual result of the command "select Field1 from
dfs.`/tmp/example.json`;":
{code:none}
+---------------------------+
| Field1 |
+---------------------------+
| {"InnerField1":{},"InnerField2":{},"InnerField3":[],"InnerField4":[]} |
| {"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"},"InnerField3":[],"InnerField4":[]} |
| {"InnerField1":{},"InnerField2":{},"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]} |
+--------------------------+
{code}
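The actual output can be modeled outside Drill for illustration. The sketch below is an assumption about how the rows end up looking this way, not Drill's reader code: it collects every inner field name seen in any record and fills the gaps with an empty map or list. The names `read_records` and `fill_missing` are hypothetical helpers.

```python
import json

def read_records(text):
    """Parse a stream of concatenated top-level JSON objects."""
    decoder = json.JSONDecoder()
    pos, records = 0, []
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records

def fill_missing(records, column):
    """Give every row every inner field seen anywhere, using {} or [] as filler."""
    empties = {}
    for rec in records:
        for name, value in rec[column].items():
            # Remember whether the field is a list or a map (assumed model).
            empties.setdefault(name, [] if isinstance(value, list) else {})
    return [
        {name: rec[column].get(name, type(empty)()) for name, empty in empties.items()}
        for rec in records
    ]

# Same three records as example.json.
EXAMPLE = '''
{ "Field1" : { } }
{ "Field1" : { "InnerField1": {"key1":"value1"}, "InnerField2": {"key2":"value2"} } }
{ "Field1" : { "InnerField3" : ["value3", "value4"], "InnerField4" : ["value5", "value6"] } }
'''

rows = fill_missing(read_records(EXAMPLE), "Field1")
for row in rows:
    print(json.dumps(row, separators=(",", ":")))
```

Under this model, each row carries all four inner fields, which matches the rows reported above.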
I think there is no need to output missing fields. With a deeply nested
structure, the result becomes unreadable for the user. So my expected result is:
{code:none}
+--------------------------+
| Field1 |
+--------------------------+
| {} |
| {"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"}} |
| {"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]} |
+--------------------------+
{code}
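As a rough illustration of the expected result, the sketch below post-processes the actual rows and drops entries whose value is an empty map or list. It is only an approximation written for this report, not a proposed Drill patch; `prune_empty` is a hypothetical helper.

```python
import json

def prune_empty(value):
    """Recursively drop map entries whose value is an empty map or list."""
    if isinstance(value, dict):
        pruned = {k: prune_empty(v) for k, v in value.items()}
        return {k: v for k, v in pruned.items() if v not in ({}, [])}
    if isinstance(value, list):
        return [prune_empty(v) for v in value]
    return value

# The Field1 values from the actual result above.
ACTUAL_ROWS = [
    {"InnerField1": {}, "InnerField2": {}, "InnerField3": [], "InnerField4": []},
    {"InnerField1": {"key1": "value1"}, "InnerField2": {"key2": "value2"},
     "InnerField3": [], "InnerField4": []},
    {"InnerField1": {}, "InnerField2": {},
     "InnerField3": ["value3", "value4"], "InnerField4": ["value5", "value6"]},
]

expected = [prune_empty(row) for row in ACTUAL_ROWS]
for row in expected:
    print(json.dumps(row, separators=(",", ":")))
```

Note that pruning after the fact cannot tell a field that was missing from one that was explicitly empty in the source file, so a real fix would presumably need to track field presence while reading each record.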
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)