Amit Katti created DRILL-1195: --------------------------------- Summary: Querying nested array elements in Json returns only null values Key: DRILL-1195 URL: https://issues.apache.org/jira/browse/DRILL-1195 Project: Apache Drill Issue Type: Bug Components: Storage - JSON Reporter: Amit Katti
If a Json obejct has an array which is present at the 1st(uppermost) level it can be queried in Drill successfully as follows {code} {"rownum":1, "arrayval": [ "a1", "a2", "a3" ]} {"rownum":2, "arrayval": [ "b1", "b2", "b3" ]} {"rownum":3, "arrayval": [ "c1", "c2", "c3" ]} {"rownum":4, "arrayval": [ "d1", "d2", "d3" ]} {"rownum":5, "arrayval": [ "e1", "e2", "e3" ]} select tbl.arrayval[0] from `nested_working.json` tbl; +------------+ | EXPR$0 | +------------+ | a1 | | b1 | | c1 | | d1 | | e1 | +------------+ 5 rows selected (0.157 seconds) {code} However if the array is present in the 2nd level or below (inner level), directly querying the array's elements throws only null values. {code} {"rownum":1, "a":{"arrayval": [ "a1", "a2", "a3" ]}} {"rownum":2, "a":{"arrayval": [ "b1", "b2", "b3" ]}} {"rownum":3, "a":{"arrayval": [ "c1", "c2", "c3" ]}} {"rownum":4, "a":{"arrayval": [ "d1", "d2", "d3" ]}} {"rownum":5, "a":{"arrayval": [ "e1", "e2", "e3" ]}} select tbl.a.arrayval[0] from `nested_failing.json` tbl; +------------+ | EXPR$0 | +------------+ | null | | null | | null | | null | | null | +------------+ 5 rows selected (0.144 seconds) {code} Select * as well as selecting the whole array works fine though. {code} select tbl.a.arrayval from `/user/root/nested_failing.json` tbl; +------------+ | EXPR$0 | +------------+ | ["a1","a2","a3"] | | ["b1","b2","b3"] | | ["c1","c2","c3"] | | ["d1","d2","d3"] | | ["e1","e2","e3"] | +------------+ 5 rows selected (0.151 seconds) {code} On the other hand if the array's elements are json objects again, we are able to successfully query them. The issue seems to be centric only to single element objects. {code} {"rownum":1, "a":{"arrayval": [ {"val1":"a1"}, {"val2":"a2"}, {"val3":"a3"} ]}} {"rownum":2, "a":{"arrayval": [ {"val1":"b1"}, {"val2":"b2"}, {"val3":"b3"} ]}} {"rownum":3, "a":{"arrayval": [ {"val1":"c1"}, {"val2":"c2"}, {"val3":"c3"} ]}} {"rownum":4, "a":{"arrayval": [ {"val1":"d1"}, {"val2":"d2"}, {"val3":"d3"} ]}} {"rownum":5, "a":{"arrayval": [ {"val1":"e1"}, {"val2":"e2"}, {"val3":"e3"} ]}} select tbl.a.arrayval[0] from `nested_working2.json` tbl; +------------+ | EXPR$0 | +------------+ | {"val1":"a1"} | | {"val1":"b1"} | | {"val1":"c1"} | | {"val1":"d1"} | | {"val1":"e1"} | +------------+ 5 rows selected (0.186 seconds) select tbl.a.arrayval[0].val1 from `nested_working2.json` tbl; +------------+ | EXPR$0 | +------------+ | a1 | | b1 | | c1 | | d1 | | e1 | +------------+ 5 rows selected (0.157 seconds) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)