+1
We had a long discussion on this topic on the dev list a month or so ago. The
conclusion seemed to be that Drill is intended to pull schema out of data
without an external schema; the data must support Drill's schema inference.
There remain holes where knowing the schema up front would be a huge win. For
now, the solution is to ETL to Parquet, which does carry a schema.
Thanks,
- Paul
On Thursday, May 31, 2018, 8:25:00 AM PDT, Lee, David
<[email protected]> wrote:
I think I opened an enhancement ticket to pass a JSON schema object into a
query, bypassing schema learning to avoid problems like this. Coordinates could
be typed as float in the schema object so Drill can cast them to float without
converting everything to doubles.
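Until something like that exists, the same effect can be approximated by normalizing the file before Drill reads it. The following is only a sketch of that idea, not a Drill feature: the helper name `floats_only` is hypothetical, and it assumes the JSON is small enough to load into memory with Python's standard `json` module.

```python
import json

def floats_only(value):
    """Recursively convert every JSON integer to a float; leave other types alone."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return float(value)
    if isinstance(value, list):
        return [floats_only(v) for v in value]
    if isinstance(value, dict):
        return {k: floats_only(v) for k, v in value.items()}
    return value

# Sample taken from the Coordinates array quoted later in this thread.
record = json.loads('{"Coordinates": [[23.53, 4.99, 11], [35.09, 7.7, 16]]}')
cleaned = floats_only(record)
print(json.dumps(cleaned))
```

After this pass every number in the list is written with a decimal point, so a reader that infers types from the first value sees one consistent floating-point type.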
It would also address the case where a key's value is NULL throughout an entire
file. Drill will cast NULL to an int, which results in a schema error if the
next file read has non-NULL string values.
Turning on "read everything as string" is a hack, and even that fails once you
start hitting NULL values for keys that hold nested keys or nested arrays.
An alternative short-term solution would be to not include NULLs in the data.
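As a rough illustration of that workaround, a preprocessing step could drop NULL-valued keys before the file is handed to Drill. This is a sketch under assumptions: `drop_nulls` is a hypothetical helper, it only removes object keys whose value is null (list elements are left in place so positions do not shift), and it assumes the data fits in memory.

```python
import json

def drop_nulls(value):
    """Recursively remove object keys whose value is null (None in Python)."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        # Recurse into list elements but keep the list's length unchanged.
        return [drop_nulls(v) for v in value]
    return value

sample = json.loads('{"id": 1, "name": null, "tags": [{"k": null, "v": "a"}]}')
print(json.dumps(drop_nulls(sample)))
```

Whether dropping the key is acceptable depends on the queries; a column that is absent from every record in a file cannot be selected from that file.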
> On May 31, 2018, at 2:23 AM, Divya Gehlot <[email protected]> wrote:
>
> I tried exec.enable_union_type, but it didn't work for me; however, the following helped:
>
> ALTER SESSION SET `store.json.read_numbers_as_double` = true;
>
>
>> On 31 May 2018 at 11:28, Padma Penumarthy <[email protected]> wrote:
>>
>> Yes, that is correct.
>> You can try setting the option “exec.enable_union_type” for that to work,
>> with the caveat that the union type is not fully supported in Drill.
>>
>> Thanks
>> Padma
>>
>>
>>> On May 30, 2018, at 7:56 PM, Divya Gehlot <[email protected]> wrote:
>>>
>>> Hi,
>>> I am reading a complex JSON file and getting an unsupported-operation
>>> error while reading the following:
>>> "Coordinates":[
>>> [
>>> 23.53,
>>> 4.99,
>>> 11
>>> ],
>>> [
>>> 35.09,
>>> 7.7,
>>> 16
>>> ]
>>> ]
>>>
>>>
>>> Error : Query execution error. Details:[
>>>> UNSUPPORTED_OPERATION ERROR: In a list of type FLOAT8, encountered a value
>>>> of type BIGINT. Drill does not support lists of different types.
>>>> Line 15
>>>> Column 19
>>>> Field Coordinates
>>>> Line 15
>>>> Column 19
>>>> Field Coordinates
>>>> Line 15
>>>> Column 19
>>>> Field Coordinates
>>>> Fragment 0:0
>>>
>>>
>>> If I remove the third coordinates (11, 16), which are integers, it works
>>> like a charm.
>>>
>>> Does that mean Drill doesn't support values of different data types in
>>> an array?
>>>
>>> Appreciate the help!
>>>
>>> Thanks,
>>> Divya
>>
>>