Re: JSON File, Total numbers Record: 1

Jason Altekruse Wed, 13 Jan 2016 11:29:28 -0800

I wasn't thinking about that; I was seeing the images because I was included
directly in the recipient list.


+1 on text for searchability.

Can you please rerun the queries and post the errors here as text?

On Wed, Jan 13, 2016 at 11:20 AM, Christopher Matta <[email protected]> wrote:

> The apache mailing lists don't support image attachments, it would be
> better to attach text anyway since it's searchable.
>
> Chris Matta
> [email protected]
> 215-701-3146
>
> On Wed, Jan 13, 2016 at 9:55 AM, Paolo Spanevello <[email protected]>
> wrote:
>
>> Hi,
>>
>> thanks again Jason for your support.
>>
>> As you suggested, I'm trying to apply the kvgen function. Below you can find
>> the query and the error.
>>
>> *select tt.flat_intervals from (select
>> flatten(kvgen(t.flat_rides.INTERVALS)) as flat_intervals from (select
>> flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t ) as tt*
>>
>> [image: Embedded image 1]
>>
>>
>> Could you please suggest how to fix this Drill query?
>>
>> Best regards,
>> Paolo
>>
>>
>>
>> 2016-01-12 16:31 GMT+01:00 Jason Altekruse <[email protected]>:
>>
>>> This is a poor error message that is produced when you try to flatten a
>>> field that is not an array. For these fields you can just use the dot
>>> notation to access their inner members (i.e.
>>> flattened_array_of_maps.member_field_in_map). If you have a field where the
>>> keys in a map are "unknown" or you want to do analysis on the keys, please
>>> refer to the KVGEN docs [1]. I have assigned the JIRA that reported this
>>> issue a while ago to myself and will work to improve the message [2].
>>>
>>>
>>> [1] - https://drill.apache.org/docs/kvgen/
>>> [2] - https://issues.apache.org/jira/browse/DRILL-2182
>>>
>>> On Tue, Jan 12, 2016 at 1:20 AM, Paolo Spanevello <[email protected]>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> Jason, I used your suggestions and they work, thanks a lot!
>>>>
>>>> As you wrote, I used a subquery to get the full list of INTERVALS, as
>>>> shown below.
>>>>
>>>> *select t.flat_rides.tags.Athlete as Athlete,t.flat_rides.crc as
>>>> crc,flatten(t.flat_rides.INTERVALS) as flat_intervals from (select
>>>> flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t*
>>>>
>>>> [image: Embedded image 1]
>>>>
>>>> The attribute "flat_intervals" is full of data that I would like to
>>>> split into several attributes, as shown at this link:
>>>> https://drill.apache.org/docs/flatten/
>>>> I used a new subquery to do that:
>>>>
>>>>
>>>> *select tt.Athlete, tt.crc, flatten(tt.flat_intervals) as newflat from
>>>> (select t.flat_rides.tags.Athlete as Athlete,t.flat_rides.crc as
>>>> crc,flatten(t.flat_rides.INTERVALS) as flat_intervals from (select
>>>> flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t) as tt*
>>>>
>>>> but I have this error:
>>>>
>>>> ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query:
>>>> select tt.Athlete, tt.crc, flatten(tt.flat_intervals) as newflat from
>>>> (select t.flat_rides.tags.Athlete as Athlete,t.flat_rides.crc as
>>>> crc,flatten(t.flat_rides.INTERVALS) as flat_intervals from (select
>>>> flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t) as tt
>>>> [30027]Query execution error. Details:[
>>>> SYSTEM ERROR: ClassCastException: Cannot cast
>>>> org.apache.drill.exec.vector.complex.MapVector to
>>>> org.apache.drill.exec.vector.complex.RepeatedValueVector
>>>>
>>>> Fragment 0:0
>>>>
>>>> [Error Id: a22fe80f-43e4-43cb-bb98-5541ecb92d4c on 192.168.1.101:31010]
>>>> ]
>>>>
>>>> [image: Embedded image 2]
>>>>
>>>> Thanks in advance!
>>>>
>>>> Paolo
>>>>
>>>>
>>>> 2016-01-11 17:39 GMT+01:00 Jason Altekruse <[email protected]>:
>>>>
>>>>> Paolo,
>>>>>
>>>>> Drill currently reads single JSON objects as single records. If you look at
>>>>> the top of your file you can see that the root of your document is a single
>>>>> JSON object.
>>>>>
>>>>> Drill accepts two formats for individual records:
>>>>>
>>>>> The Mongo import format: a series of JSON objects one after the other in a
>>>>> file. Whitespace is irrelevant, and each one need not be followed by a
>>>>> newline.
>>>>>
>>>>> {"a" : 1, "b" : "hello" }
>>>>> {"a": 5, "b" : "guten tag" }
>>>>>
>>>>> A JSON array of objects
>>>>>
>>>>> [
>>>>>     {"a" : 1, "b" : "hello" },
>>>>>     {"a" : 2, "b" : "guten tag"}
>>>>> ]
>>>>>
>>>>> When you have a file like yours, where a single root object contains an
>>>>> array, you can use Drill's FLATTEN function to turn that array into a
>>>>> series of records.
>>>>> https://drill.apache.org/docs/flatten/
>>>>>
>>>>> select flatten(rides) as flat_rides from dfs.tmp.`rideDB.json`;
>>>>>
>>>>> To work with the data further, you can put the flatten call in a subquery.
>>>>> Here is how you can select the first element from each record's list of
>>>>> INTERVALS and select one of the nested fields inside of METRICS once the
>>>>> data has been flattened.
>>>>> To analyze the array, you could flatten again to get an exploded dataset
>>>>> with one record per interval across all records.
>>>>>
>>>>> select t.flat_rides.INTERVALS[0], t.flat_rides.METRICS.skiba_wprime_low
>>>>> from (select flatten(rides) as flat_rides from dfs.tmp.`rideDB.json`)
>>>>> as t;
>>>>>
>>>>> Here you can see that individual columns can be selected next to the
>>>>> flatten call; this copies their data into each new record:
>>>>>
>>>>> select flatten(t.flat_rides.INTERVALS) as flat_intervals,
>>>>> t.flat_rides.METRICS.skiba_wprime_low from (select flatten(rides) as
>>>>> flat_rides from dfs.tmp.`rideDB.json`) as t;
>>>>>
>>>>> Happy Drilling!
>>>>>
>>>>> On Sun, Jan 10, 2016 at 4:23 AM, Paolo Spanevello <
>>>>> [email protected]>
>>>>> wrote:
>>>>>
>>>>> > Hi all,
>>>>> >
>>>>> > I'm trying to query the attached file with Apache Drill 1.4. The
>>>>> > result of this query is always 1 record.
>>>>> >
>>>>> > The query that I'm running is:
>>>>> >
>>>>> > SELECT t.rides.INTERVALS.METRICS FROM rideDB.json t
>>>>> >
>>>>> > If I run a similar query with the file donuts.json found on
>>>>> > https://drill.apache.org/docs/sample-data-donuts/ the query runs
>>>>> > properly.
>>>>> >
>>>>> > SELECT t.topping FROM donuts.json t
>>>>> >
>>>>> > Thanks in advance.
>>>>> >
>>>>> > Paolo
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>
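
For readers following the FLATTEN versus KVGEN discussion quoted above, here is a rough Python sketch of the behaviour described in the thread. It is a toy model, not Drill's implementation: the functions flatten and kvgen below only imitate the described semantics, and the sample document, including the field names "start" and "stop", is hypothetical. It shows that an array can be flattened, that flattening a map fails (the TypeError stands in for the ClassCastException in the thread), and that converting a map to key/value pairs first makes it flattenable.

```python
def flatten(rows, column):
    """Toy model of FLATTEN: one output row per element of an array column."""
    out = []
    for row in rows:
        value = row[column]
        if not isinstance(value, list):
            # Stand-in for the ClassCastException quoted in the thread:
            # a map is not a repeated (array) value, so it cannot be flattened.
            raise TypeError(f"cannot flatten {column!r}: not an array")
        for element in value:
            # Columns selected next to the flatten call are copied into each new row.
            out.append({**row, column: element})
    return out


def kvgen(mapping):
    """Toy model of KVGEN: turn a map into an array of key/value pairs."""
    return [{"key": k, "value": v} for k, v in mapping.items()]


# Hypothetical data: one root object holding a "rides" array (field names assumed).
doc = {
    "rides": [
        {"crc": 1, "INTERVALS": [{"start": 0, "stop": 600},
                                 {"start": 600, "stop": 1800}]},
        {"crc": 2, "INTERVALS": [{"start": 0, "stop": 30}]},
    ]
}

# First flatten: one row per ride.
flat_rides = flatten([doc], "rides")

# Project columns, then flatten the INTERVALS array: one row per interval (a map).
projected = [{"crc": r["rides"]["crc"],
              "flat_intervals": r["rides"]["INTERVALS"]} for r in flat_rides]
flat_intervals = flatten(projected, "flat_intervals")

# Flattening again fails, because each flat_intervals value is now a map.
try:
    flatten(flat_intervals, "flat_intervals")
    second_flatten_error = None
except TypeError as exc:
    second_flatten_error = str(exc)

# Convert each map to key/value pairs first; the pairs form an array and can be flattened.
with_pairs = [{"crc": r["crc"], "kv": kvgen(r["flat_intervals"])}
              for r in flat_intervals]
flat_pairs = flatten(with_pairs, "kv")

print(len(flat_rides), len(flat_intervals), len(flat_pairs))
```

In this model the dot-notation advice from the thread corresponds to reading a member straight from the map, for example row["flat_intervals"]["start"], with no further flatten needed.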

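The "always 1 record" result from the original question can be illustrated the same way. This is a minimal sketch, assuming the record rules stated in the thread: each top-level JSON object is one record, and a top-level array contributes one record per element. The helper count_records is hypothetical and is not how Drill reads files; it uses only Python's standard json module.

```python
import json


def count_records(text):
    """Count records under the rules described in the thread (toy model)."""
    decoder = json.JSONDecoder()
    pos, count, end = 0, 0, len(text)
    while True:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        value, pos = decoder.raw_decode(text, pos)
        # A top-level array yields one record per element; anything else is one record.
        count += len(value) if isinstance(value, list) else 1
    return count


single_root = '{"rides": [{"crc": 1}, {"crc": 2}]}'
concatenated = '{"a" : 1, "b" : "hello" }\n{"a": 5, "b" : "guten tag" }'
array_of_objects = '[{"a" : 1, "b" : "hello" }, {"a" : 2, "b" : "guten tag"}]'

print(count_records(single_root))       # the whole file is one record
print(count_records(concatenated))      # one record per object
print(count_records(array_of_objects))  # one record per array element
```

Under these rules a file whose root is a single object counts as one record however large its nested arrays are, which is why the thread recommends FLATTEN for the nested "rides" array.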