Drill version: 1.4.0. Assuming 3 JSON objects with the following structure:
{ ...
"batters":
{
"batter":
[
{ "id": "1001", "type": "Regular" },
{ "id": "1002", "type": "Chocolate" },
{ "id": "1003", "type": "Blueberry" },
{ "id": "1004", "type": "Devil's Food" }
]
},
...
}
Now running a few sample queries against the above data:
A) select "batters" returns expected results, which are the values of
"batters" from each row.
0: jdbc:drill:zk=local> select batters from dfs.`c:\tmp\sample.json`;
+---------+
| batters |
+---------+
|
{"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"1003","type":"Blueberry"},{"id":"1004","type":"Devil's
Food"}]} |
| {"batter":[{"id":"1001","type":"Regular"}]} |
| {"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"}]} |
+---------+
3 rows selected (0.243 seconds)
B) select "batters.batter" also returns the expected results, which are
the array values for "batters.batter" from each row.
0: jdbc:drill:zk=local> select t.batters.batter from dfs.`c:\tmp\sample.json` t;
+--------+
| EXPR$0 |
+--------+
|
[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"1003","type":"Blueberry"},{"id":"1004","type":"Devil's
Food"}] |
| [{"id":"1001","type":"Regular"}] |
| [{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"}] |
+--------+
3 rows selected (0.198 seconds)
C) select "batters.batter.id" returns something unexpected:
0: jdbc:drill:zk=local> select t.batters.batter.id from
dfs.`c:\tmp\sample.json` t;
+---------+
| EXPR$0 |
+---------+
| 1001 |
| 1002 |
| 1003 |
+---------+
The above result doesn't make sense. The result looks like the 3 values from
row 1. Should the result be the following instead?
+---------+
| EXPR$0 |
+---------+
| [1001, 1002, 1003, 1004] |
| [1001] |
| [1001, 1002] |
+---------+
Any hints on what is happening here? Thanks.
-- Jiang