Re: JSON File, Total numbers Record: 1

Paolo Spanevello Tue, 12 Jan 2016 01:21:01 -0800

Hi  All,

Jason,I used your suggests and it works, thanks a lot!


As u wrote i used a subquery to have the all list of INTERVALS as I show
below.

*select t.flat_rides.tags.Athlete as Athlete,t.flat_rides.crc as
crc,flatten(t.flat_rides.INTERVALS) as flat_intervals from (select
flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t*

[image: Immagine incorporata 1]

The attribute "flat_intervals" is full of data that I would like to have
separate them in several attributes as it is showed in the link:
https://drill.apache.org/docs/flatten/
I used a new subquery to have them:


*select tt.Athlete, tt.crc, flatten(tt.flat_intervals) as newflat from
(select t.flat_rides.tags.Athlete as Athlete,t.flat_rides.crc as
crc,flatten(t.flat_rides.INTERVALS) as flat_intervals from (select
flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t) as tt*

but I have this error:

ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query:
select tt.Athlete, tt.crc, flatten(tt.flat_intervals) as newflat from
(select t.flat_rides.tags.Athlete as Athlete,t.flat_rides.crc as
crc,flatten(t.flat_rides.INTERVALS) as flat_intervals from (select
flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t) as tt
[30027]Query execution error. Details:[
SYSTEM ERROR: ClassCastException: Cannot cast
org.apache.drill.exec.vector.complex.MapVector to
org.apache.drill.exec.vector.complex.RepeatedValueVector

Fragment 0:0

[Error Id: a22fe80f-43e4-43cb-bb98-5541ecb92d4c on 192.168.1.101:31010]
]

[image: Immagine incorporata 2]

Thanks in advance!

Paolo


2016-01-11 17:39 GMT+01:00 Jason Altekruse <altekruseja...@gmail.com>:

> Paolo,
>
> Drill currently reads single JSON objects as single records. If you look at
> the top of your file you can see that the root of your document is a single
> JSON object.
>
> Drill accepts two formats for individual records:
>
> The Mongo import format, a series of JSON object one after the other in a
> file, whitespace is irrelevant, each one need not be followed by a newline
>
> {"a" : 1, "b" : "hello" }
> {"a": 5 : "b" : "guten tag" }
>
> A JSON array of objects
>
> [
>     {"a" : 1, "b" : "hello" },
>     {"a" : 2, "b", "guten tag"}
> ]
>
> When you have a file like this you can read it using the FLATTEN
> functionality of Drill to turn an array into a series of records.
> https://drill.apache.org/docs/flatten/
>
> select flatten(rides) as flat_rides from dfs.tmp.`rideDB.json`;
>
> To work with the data further, you can put the flatten call in a subquery.
> Here is how you can select the first element from each records list of
> INTERVALS and select one of the nested fields inside of METRICS once the
> data has been flattened.
> To analyze the array, you could flatten again to get an exploded dataset
> with one record per interval across all records
>
> select t.flat_rides.INTERVALS[0], t.flat_rides.METRICS.skiba_wprime_low
> from (select flatten(rides) as flat_rides from dfs.tmp.`rideDB.json`) as t;
>
> Here you can see that individual columns can be selected next to the
> flatten call, this will copy the data into each new record:
>
> select flatten(t.flat_rides.INTERVALS) as flat_intervals,
> t.flat_rides.METRICS.skiba_wprime_low from (select flatten(rides) as
> flat_rides from dfs.tmp.`rideDB.json`) as t;
>
> Happy Drilling!
>
> On Sun, Jan 10, 2016 at 4:23 AM, Paolo Spanevello <paolosp...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > i'm trying to query the file that you can find in attach with drill
> apache
> > 1.4 . The result of this qurey is always 1 record.
> >
> > The query that i'm running is :
> >
> > SELECT t.rides.INTERVALS.METRICS FROM rideDB.json t
> >
> > If i run the similar query with the file donuts.json found on
> > https://drill.apache.org/docs/sample-data-donuts/ the query runs
> properly.
> >
> > SELECT t.topping FROM donuts.json t
> >
> > Thanks in advance.
> >
> > Paolo
> >
>

Re: JSON File, Total numbers Record: 1

Reply via email to