Paolo,
Drill currently reads single JSON objects as single records. If you look at
the top of your file you can see that the root of your document is a single
JSON object.
Drill accepts two formats for individual records:
The Mongo import format, a series of JSON object one after the other in a
file, whitespace is irrelevant, each one need not be followed by a newline
{"a" : 1, "b" : "hello" }
{"a": 5 : "b" : "guten tag" }
A JSON array of objects
[
{"a" : 1, "b" : "hello" },
{"a" : 2, "b", "guten tag"}
]
When you have a file like this you can read it using the FLATTEN
functionality of Drill to turn an array into a series of records.
https://drill.apache.org/docs/flatten/
select flatten(rides) as flat_rides from dfs.tmp.`rideDB.json`;
To work with the data further, you can put the flatten call in a subquery.
Here is how you can select the first element from each records list of
INTERVALS and select one of the nested fields inside of METRICS once the
data has been flattened.
To analyze the array, you could flatten again to get an exploded dataset
with one record per interval across all records
select t.flat_rides.INTERVALS[0], t.flat_rides.METRICS.skiba_wprime_low
from (select flatten(rides) as flat_rides from dfs.tmp.`rideDB.json`) as t;
Here you can see that individual columns can be selected next to the
flatten call, this will copy the data into each new record:
select flatten(t.flat_rides.INTERVALS) as flat_intervals,
t.flat_rides.METRICS.skiba_wprime_low from (select flatten(rides) as
flat_rides from dfs.tmp.`rideDB.json`) as t;
Happy Drilling!
On Sun, Jan 10, 2016 at 4:23 AM, Paolo Spanevello <[email protected]>
wrote:
> Hi all,
>
> i'm trying to query the file that you can find in attach with drill apache
> 1.4 . The result of this qurey is always 1 record.
>
> The query that i'm running is :
>
> SELECT t.rides.INTERVALS.METRICS FROM rideDB.json t
>
> If i run the similar query with the file donuts.json found on
> https://drill.apache.org/docs/sample-data-donuts/ the query runs properly.
>
> SELECT t.topping FROM donuts.json t
>
> Thanks in advance.
>
> Paolo
>