Chris 

Thanks, but I tried it before without success.

I still get this on Drill 0.7. 

select t.entities.hashtags from dfs.twitter.`/nfl` t where 
t.entities.hashtags[0] is not null limit 10;

Error: exception while executing query: Failure while executing query. 
(state=,code=0)
Query failed: Query failed: Failure while running fragment., Failure while 
trying to materialize incoming schema.  Errors:

Error in expression at index 0.  Error: Missing function implementation: 
[isnotnull(MAP-REQUIRED)].  Full expression: null.. [ 
0dab71b6-1860-4739-b752-1458ed29b671 on drilldemo:31010 ]
[ 0dab71b6-1860-4739-b752-1458ed29b671 on drilldemo:31010 ]


Mine seems finicky about even using the first element in the array

select t.entities.hashtags[0] from dfs.twitter.`/nfl` t limit 10;

Query failed: Query failed: Failure while running fragment., index: 4, length: 
4 (expected: range(0, 4)) [ 62cd2d3a-546e-41c8-a0c5-4da2b81c2ef8 on 
drilldemo:31010 ]
[ 62cd2d3a-546e-41c8-a0c5-4da2b81c2ef8 on drilldemo:31010 ]


However flatten seems to work. (as long as you don’t have thousands of small 
json files)

select flatten(t.entities.hashtags) from dfs.twitter.`/nfl` t limit 10;

+------------+
|   EXPR$0   |
+------------+
| {"text":"PantherPride","indices":["12","25"]} |
| {"text":"KeepPounding","indices":["48","61"]} |
| {"text":"vegas","indices":["107","113"]} |
| {"text":"nfl","indices":["23","27"]} |
| {"text":"Cowboys","indices":["7","15"]} |
| {"text":"Cowboys","indices":["7","15"]} |
| {"text":"Ebay","indices":["52","57"]} |
| {"text":"NFL","indices":["58","62"]} |
| {"text":"Christmas","indices":["63","73"]} |
| {"text":"Gift","indices":["74","79"]} |
+------------+


—Andries




On Feb 5, 2015, at 6:41 PM, Christopher Matta <[email protected]> wrote:

> Andreas, I think I may have solved your issue:
> 
>> select t.entities.hashtags FROM mfs.cmatta.`tweets/blackfriday` t WHERE 
>> t.entities.hashtags[0].text is not null limit 10;
> +------------+
> |   EXPR$0   |
> +------------+
> | 
> [{"text":"Blackfriday","indices":[11,23]},{"text":"facebook","indices":[56,65]},{"text":"christmas","indices":[66,76]}]
> |
> | 
> [{"text":"BlackFriday","indices":[65,77]},{"text":"ViernesNegro","indices":[105,118]},{"text":"ofertas","indices":[122,130]}]
> |
> | 
> [{"text":"StarWars","indices":[21,30]},{"text":"BlackFriday","indices":[71,83]}]
> |
> | [{"text":"NotOneDime","indices":[14,25]}] |
> | [{"text":"BlackFriday","indices":[43,55]}] |
> | [{"text":"JRStudio","indices":[93,102]}] |
> | 
> [{"text":"luv","indices":[38,42]},{"text":"luvmanicures","indices":[43,56]},{"text":"luvpedicures","indices":[57,70]},{"text":"luvroyaloak","indices":[71,83]},{"text":"woodwardave","indices":[84,96]}]
> |
> | [{"text":"BlackFriday","indices":[0,12]}] |
> | 
> [{"text":"BlackFriday","indices":[18,30]},{"text":"SOE","indices":[65,69]},{"text":"DoubleSC","indices":[82,91]}]
> |
> | 
> [{"text":"BlackFriday","indices":[18,30]},{"text":"SOE","indices":[65,69]},{"text":"DoubleSC","indices":[82,91]}]
> |
> +------------+
> 10 rows selected (1.697 seconds)
> 
> I noticed when only extracting the first text item in the hashtag array (
> t.entities.hashtags[0].text) it was returning null for those tweets that
> didn’t have hashtags, so filtering out that seems to work.
> ​
> 
> Chris Matta
> [email protected]
> 215-701-3146
> 
> On Mon, Jan 26, 2015 at 6:43 PM, Jason Altekruse <[email protected]>
> wrote:
> 
>> As Aditya commented before this will work if the lists only contain scalars
>> 
>> repeated_count('entities.urls') > 0
>> 
>> If the lists contain maps unfortunately this is not available today. There
>> is an enhancement request open for this feature. I have marked it for a fix
>> in 0.9 as it is more of a feature request than a bug and we are working on
>> closing a large number of bugs for 0.8 before we get to issues like this.
>> 
>> https://issues.apache.org/jira/browse/DRILL-1650
>> 
>> -Jason
>> 
>> On Mon, Jan 26, 2015 at 3:26 PM, Andries Engelbrecht <
>> [email protected]> wrote:
>> 
>>> Unfortunately it seems that with larger data sets the use of flatten
>> seems
>>> to produce an error.
>>> 
>>> Any other options to filter out JSON records (objects) where an array is
>>> empty?
>>> 
>>> Thanks
>>> —Andries
>>> 
>>> 
>>> On Jan 21, 2015, at 5:18 PM, Andries Engelbrecht <
>>> [email protected]> wrote:
>>> 
>>>> It was interesting to see flatten bypass the records with an empty
>> array.
>>>> 
>>>> Unfortunately some arrays are much more complex than the example here,
>>> and it is still useful to have the ability to filter out ones that are
>>> empty.
>>>> 
>>>> —Andries
>>>> 
>>>> On Jan 21, 2015, at 5:04 PM, Hao Zhu <[email protected]> wrote:
>>>> 
>>>>> I figured out the differences after getting the json file from
>> Andries.
>>>>> 
>>>>> My json file is like:
>>>>> {“entities”:{xxx},“entities”:{yyy}... }
>>>>> 
>>>>> Andries' json file is like:
>>>>> {“entities”:{xxx}}
>>>>> {“entities”:{yyy}}
>>>>> ...
>>>>> 
>>>>> So basically Andries' json file is contains multiple json files.
>>>>> 
>>>>> Fortunately, we can use the same SQL to get the same results:
>>>>> 0: jdbc:drill:> select t.entities.hashtags from dfs.tmp.`z2.json` t;
>>>>> +------------+
>>>>> |   EXPR$0   |
>>>>> +------------+
>>>>> | []         |
>>>>> | [{"text":"GoPatriots"}] |
>>>>> | [{"text":"aaa"}] |
>>>>> | []         |
>>>>> | [{"text":"bbb"}] |
>>>>> +------------+
>>>>> 5 rows selected (0.136 seconds)
>>>>> 0: jdbc:drill:> with tmp as
>>>>> . . . . . . . > (
>>>>> . . . . . . . > select flatten(t.entities.hashtags) as c from
>>>>> dfs.tmp.`z2.json` t
>>>>> . . . . . . . > )
>>>>> . . . . . . . > select tmp.c.text from tmp;
>>>>> +------------+
>>>>> |   EXPR$0   |
>>>>> +------------+
>>>>> | GoPatriots |
>>>>> | aaa        |
>>>>> | bbb        |
>>>>> +------------+
>>>>> 3 rows selected (0.115 seconds)
>>>>> 
>>>>> Thanks,
>>>>> Hao
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Jan 21, 2015 at 4:34 PM, Andries Engelbrecht <
>>>>> [email protected]> wrote:
>>>>> 
>>>>>> In my case it returns the empty records when flatten is not used.
>>>>>> 
>>>>>> 0: jdbc:drill:zk=drilldemo:5181> select t.entities.hashtags as
>> hashtags
>>>>>> from `twitter.json` t limit 10;
>>>>>> +------------+
>>>>>> |  hashtags  |
>>>>>> +------------+
>>>>>> | []         |
>>>>>> | [{"text":"SportsNews","indices":[0,11]}] |
>>>>>> | []         |
>>>>>> | [{"text":"SportsNews","indices":[0,11]}] |
>>>>>> | []         |
>>>>>> | []         |
>>>>>> | []         |
>>>>>> | []         |
>>>>>> | []         |
>>>>>> | [{"text":"CARvsSEA","indices":[36,45]}] |
>>>>>> +------------+
>>>>>> 
>>>>>> On Jan 21, 2015, at 4:26 PM, Hao Zhu <[email protected]> wrote:
>>>>>> 
>>>>>>> Actually not due to flatten, if you directly query the file, it will
>>> only
>>>>>>> show the non-null values.
>>>>>>> 
>>>>>>> 0: jdbc:drill:> select t.entities.hashtags from dfs.tmp.`z.json` t;
>>>>>>> +------------+
>>>>>>> |   EXPR$0   |
>>>>>>> +------------+
>>>>>>> | [{"text":"GoPatriots"},{"text":"aaa"},{"text":"bbb"}] |
>>>>>>> +------------+
>>>>>>> 1 row selected (0.123 seconds)
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Hao
>>>>>>> 
>>>>>>> On Wed, Jan 21, 2015 at 4:18 PM, Andries Engelbrecht <
>>>>>>> [email protected]> wrote:
>>>>>>> 
>>>>>>>> Very interesting, flatten seems to bypass empty records. Not sure
>> if
>>>>>> that
>>>>>>>> is an ideal result for all use cases, but certainly usable in this
>>> case.
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> —Andries
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Jan 21, 2015, at 3:45 PM, Hao Zhu <[email protected]> wrote:
>>>>>>>> 
>>>>>>>>> I can also fetch non-null values for attached json file which
>>> contains
>>>>>> 6
>>>>>>>> "entities", 3 of them are null, 3 of them have "text" value.
>>>>>>>>> 
>>>>>>>>> Could you share your complete "twitter.json"?
>>>>>>>>> 
>>>>>>>>> 0: jdbc:drill:> with tmp as
>>>>>>>>> . . . . . . . > (
>>>>>>>>> . . . . . . . > select flatten(t.entities.hashtags) as c from
>>>>>>>> dfs.tmp.`z.json` t
>>>>>>>>> . . . . . . . > )
>>>>>>>>> . . . . . . . > select tmp.c.text from tmp;
>>>>>>>>> +------------+
>>>>>>>>> |   EXPR$0   |
>>>>>>>>> +------------+
>>>>>>>>> | GoPatriots |
>>>>>>>>> | aaa        |
>>>>>>>>> | bbb        |
>>>>>>>>> +------------+
>>>>>>>>> 3 rows selected (0.122 seconds)
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Hao
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wed, Jan 21, 2015 at 3:35 PM, Andries Engelbrecht <
>>>>>>>> [email protected]> wrote:
>>>>>>>>> The sample data I posted only has 1 element, but some records have
>>>>>>>> multiple elements in them.
>>>>>>>>> 
>>>>>>>>> Interestingly enough though
>>>>>>>>> select t.entities.hashtags[0].`text` from `twitter.json` t limit
>> 10;
>>>>>>>>> 
>>>>>>>>> Produces
>>>>>>>>> +------------+
>>>>>>>>> |   EXPR$0   |
>>>>>>>>> +------------+
>>>>>>>>> | null       |
>>>>>>>>> | SportsNews |
>>>>>>>>> | null       |
>>>>>>>>> | SportsNews |
>>>>>>>>> | null       |
>>>>>>>>> | null       |
>>>>>>>>> | null       |
>>>>>>>>> | null       |
>>>>>>>>> | null       |
>>>>>>>>> | CARvsSEA   |
>>>>>>>>> +——————+
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> And
>>>>>>>>> 
>>>>>>>>> select t.entities.hashtags[0] from `twitter.json` t limit 10;
>>>>>>>>> 
>>>>>>>>> +------------+
>>>>>>>>> |   EXPR$0   |
>>>>>>>>> +------------+
>>>>>>>>> | {"indices":[]} |
>>>>>>>>> | {"text":"SportsNews","indices":[90,99]} |
>>>>>>>>> | {"indices":[]} |
>>>>>>>>> | {"text":"SportsNews","indices":[90,99]} |
>>>>>>>>> | {"indices":[]} |
>>>>>>>>> | {"indices":[]} |
>>>>>>>>> | {"indices":[]} |
>>>>>>>>> | {"indices":[]} |
>>>>>>>>> | {"indices":[]} |
>>>>>>>>> | {"text":"CARvsSEA","indices":[90,99]} |
>>>>>>>>> +——————+
>>>>>>>>> 
>>>>>>>>> Strange part is that there is no indices map in the hashtags
>> array,
>>> so
>>>>>>>> no idea why it shows up when pointing to the first lament in an
>> empty
>>>>>> array.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> —Andries
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Jan 21, 2015, at 3:24 PM, Hao Zhu <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>>> I just noticed that the result of "hashtags" is just an array
>> with
>>>>>>>> only 1
>>>>>>>>>> element.
>>>>>>>>>> So take your example:
>>>>>>>>>> [root@maprdemo tmp]# cat d.json
>>>>>>>>>> {
>>>>>>>>>> "entities": {
>>>>>>>>>> "trends": [],
>>>>>>>>>> "symbols": [],
>>>>>>>>>> "urls": [],
>>>>>>>>>> "hashtags": [],
>>>>>>>>>> "user_mentions": []
>>>>>>>>>> },
>>>>>>>>>> "entities": {
>>>>>>>>>> "trends": [1,2,3],
>>>>>>>>>> "symbols": [4,5,6],
>>>>>>>>>> "urls": [7,8,9],
>>>>>>>>>> "hashtags": [
>>>>>>>>>> {
>>>>>>>>>>   "text": "GoPatriots",
>>>>>>>>>>   "indices": []
>>>>>>>>>> }
>>>>>>>>>> ],
>>>>>>>>>> "user_mentions": []
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>> 
>>>>>>>>>> Now we can do this to achieve the results:
>>>>>>>>>> 0: jdbc:drill:> select t.entities.hashtags from dfs.tmp.`d.json`
>> t
>>> ;
>>>>>>>>>> +------------+
>>>>>>>>>> |   EXPR$0   |
>>>>>>>>>> +------------+
>>>>>>>>>> | [{"text":"GoPatriots"}] |
>>>>>>>>>> +------------+
>>>>>>>>>> 1 row selected (0.09 seconds)
>>>>>>>>>> 0: jdbc:drill:> select t.entities.hashtags[0].text from
>>>>>>>> dfs.tmp.`d.json` t ;
>>>>>>>>>> +------------+
>>>>>>>>>> |   EXPR$0   |
>>>>>>>>>> +------------+
>>>>>>>>>> | GoPatriots |
>>>>>>>>>> +------------+
>>>>>>>>>> 1 row selected (0.108 seconds)
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Hao
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Wed, Jan 21, 2015 at 3:01 PM, Andries Engelbrecht <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>>> When I run the query on a larger dataset it actually show the
>>> empty
>>>>>>>>>>> records.
>>>>>>>>>>> 
>>>>>>>>>>> select t.entities.hashtags from `twitter.json` t limit 10;
>>>>>>>>>>> 
>>>>>>>>>>> +------------+
>>>>>>>>>>> |   EXPR$0   |
>>>>>>>>>>> +------------+
>>>>>>>>>>> | []         |
>>>>>>>>>>> | [{"text":"SportsNews","indices":[0,11]}] |
>>>>>>>>>>> | []         |
>>>>>>>>>>> | [{"text":"SportsNews","indices":[0,11]}] |
>>>>>>>>>>> | []         |
>>>>>>>>>>> | []         |
>>>>>>>>>>> | []         |
>>>>>>>>>>> | []         |
>>>>>>>>>>> | []         |
>>>>>>>>>>> | [{"text":"CARvsSEA","indices":[36,45]}] |
>>>>>>>>>>> +------------+
>>>>>>>>>>> 10 rows selected (2.899 seconds)
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> However having the output as maps is not very useful, unless i
>> can
>>>>>>>> filter
>>>>>>>>>>> out the records with empty arrays and then drill deeper into the
>>> ones
>>>>>>>> with
>>>>>>>>>>> data in the arrays.
>>>>>>>>>>> 
>>>>>>>>>>> BTW: Hao I would have expiated your query to return both rows,
>> one
>>>>>>>> with an
>>>>>>>>>>> empty array as above and the other with the array data.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> —Andries
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Jan 21, 2015, at 2:56 PM, Hao Zhu <[email protected]> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> I am not sure if below is expected behavior.
>>>>>>>>>>>> If we only select "hashtags", and it will return only 1 row
>>> ignoring
>>>>>>>> the
>>>>>>>>>>>> null value.
>>>>>>>>>>>> However then if we try to get "hashtags.text", it fails...which
>>>>>>>> means it
>>>>>>>>>>> is
>>>>>>>>>>>> still trying to read the NULL value.
>>>>>>>>>>>> I am thinking it may confuse the SQL developers.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 0: jdbc:drill:> select t.entities.hashtags from
>> dfs.tmp.`d.json`
>>> t ;
>>>>>>>>>>>> +------------+
>>>>>>>>>>>> |   EXPR$0   |
>>>>>>>>>>>> +------------+
>>>>>>>>>>>> | [{"text":"GoPatriots"}] |
>>>>>>>>>>>> +------------+
>>>>>>>>>>>> 1 row selected (0.109 seconds)
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 0: jdbc:drill:> select t.entities.hashtags.text from
>>>>>>>> dfs.tmp.`d.json` t ;
>>>>>>>>>>>> 
>>>>>>>>>>>> Query failed: Query failed: Failure while running fragment.,
>>>>>>>>>>>> org.apache.drill.exec.vector.complex.RepeatedMapVector cannot
>> be
>>>>>>>> cast to
>>>>>>>>>>>> org.apache.drill.exec.vector.complex.MapVector [
>>>>>>>>>>>> 7ab63d4e-8a1d-4e23-8853-a879db7e8a5f on maprdemo:31010 ]
>>>>>>>>>>>> [ 7ab63d4e-8a1d-4e23-8853-a879db7e8a5f on maprdemo:31010 ]
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Error: exception while executing query: Failure while executing
>>>>>>>> query.
>>>>>>>>>>>> (state=,code=0)
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Hao
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, Jan 21, 2015 at 2:43 PM, Andries Engelbrecht <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Now try on hashtags with the following:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> drilldemo:5181> select t.entities.hashtags.`text` from
>>>>>>>> `/twitter.json` t
>>>>>>>>>>>>> where t.entities.hashtags is not null limit 10;
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Query failed: Query failed: Failure while running fragment.,
>>>>>>>>>>>>> org.apache.drill.exec.vector.complex.RepeatedMapVector cannot
>> be
>>>>>>>> cast to
>>>>>>>>>>>>> org.apache.drill.exec.vector.complex.MapVector [
>>>>>>>>>>>>> 6fe7f918-d1a7-4fc6-b24d-44ff9186f59e on drilldemo:31010 ]
>>>>>>>>>>>>> [ 6fe7f918-d1a7-4fc6-b24d-44ff9186f59e on drilldemo:31010 ]
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Error: exception while executing query: Failure while
>> executing
>>>>>>>> query.
>>>>>>>>>>>>> (state=,code=0)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> {
>>>>>>>>>>>>> "entities": {
>>>>>>>>>>>>> "trends": [],
>>>>>>>>>>>>> "symbols": [],
>>>>>>>>>>>>> "urls": [],
>>>>>>>>>>>>> "hashtags": [],
>>>>>>>>>>>>> "user_mentions": []
>>>>>>>>>>>>> },
>>>>>>>>>>>>> "entities": {
>>>>>>>>>>>>> "trends": [1,2,3],
>>>>>>>>>>>>> "symbols": [4,5,6],
>>>>>>>>>>>>> "urls": [7,8,9],
>>>>>>>>>>>>> "hashtags": [
>>>>>>>>>>>>> {
>>>>>>>>>>>>>  "text": "GoPatriots",
>>>>>>>>>>>>>  "indices": []
>>>>>>>>>>>>> }
>>>>>>>>>>>>> ],
>>>>>>>>>>>>> "user_mentions": []
>>>>>>>>>>>>> }
>>>>>>>>>>>>> }
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The issue seems to be that if some records have arrays with
>>> maps in
>>>>>>>> them
>>>>>>>>>>>>> and others are empty.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> —Andries
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Jan 21, 2015, at 2:34 PM, Hao Zhu <[email protected]>
>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Seems it works for below json file:
>>>>>>>>>>>>>> {
>>>>>>>>>>>>>> "entities": {
>>>>>>>>>>>>>> "trends": [],
>>>>>>>>>>>>>> "symbols": [],
>>>>>>>>>>>>>> "urls": [],
>>>>>>>>>>>>>> "hashtags": [
>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>  "text": "GoPatriots",
>>>>>>>>>>>>>>  "indices": [
>>>>>>>>>>>>>>    83,
>>>>>>>>>>>>>>    94
>>>>>>>>>>>>>>  ]
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> ],
>>>>>>>>>>>>>> "user_mentions": []
>>>>>>>>>>>>>> },
>>>>>>>>>>>>>> "entities": {
>>>>>>>>>>>>>> "trends": [1,2,3],
>>>>>>>>>>>>>> "symbols": [4,5,6],
>>>>>>>>>>>>>> "urls": [7,8,9],
>>>>>>>>>>>>>> "hashtags": [
>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>  "text": "GoPatriots",
>>>>>>>>>>>>>>  "indices": []
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> ],
>>>>>>>>>>>>>> "user_mentions": []
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 0: jdbc:drill:> select t.entities.urls from dfs.tmp.`a.json`
>>> as t
>>>>>>>> where
>>>>>>>>>>>>>> t.entities.urls is not null;
>>>>>>>>>>>>>> +------------+
>>>>>>>>>>>>>> |   EXPR$0   |
>>>>>>>>>>>>>> +------------+
>>>>>>>>>>>>>> | [7,8,9]    |
>>>>>>>>>>>>>> +------------+
>>>>>>>>>>>>>> 1 row selected (0.139 seconds)
>>>>>>>>>>>>>> 0: jdbc:drill:> select t.entities.urls from dfs.tmp.`a.json`
>>> as t
>>>>>>>> where
>>>>>>>>>>>>>> t.entities.urls is null;
>>>>>>>>>>>>>> +------------+
>>>>>>>>>>>>>> |   EXPR$0   |
>>>>>>>>>>>>>> +------------+
>>>>>>>>>>>>>> +------------+
>>>>>>>>>>>>>> No rows selected (0.158 seconds)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Hao
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Wed, Jan 21, 2015 at 2:01 PM, Aditya <
>>> [email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I believe that this works if the array contains homogeneous
>>>>>>>> primitive
>>>>>>>>>>>>>>> types. In your example, it appears from the error, the array
>>>>>> field
>>>>>>>>>>>>> 'member'
>>>>>>>>>>>>>>> contained maps for at least one record.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Wed, Jan 21, 2015 at 1:57 PM, Christopher Matta <
>>>>>>>> [email protected]>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Trying that locally did not work for me (drill 0.7.0):
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 0: jdbc:drill:zk=local> select `id`, `name`, `members` from
>>>>>>>>>>>>>>> `Downloads/test.json` where repeated_count(`members`) > 0;
>>>>>>>>>>>>>>>> Query failed: Query stopped., Failure while trying to
>>>>>> materialize
>>>>>>>>>>>>>>> incoming schema.  Errors:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Error in expression at index -1.  Error: Missing function
>>>>>>>>>>>>>>> implementation: [repeated_count(MAP-REPEATED)].  Full
>>> expression:
>>>>>>>>>>>>> --UNKNOWN
>>>>>>>>>>>>>>> EXPRESSION--.. [ 47142fa4-7e6a-48cb-be6a-676e885ede11 on
>>>>>>>>>>>>> bullseye-3:31010 ]
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Error: exception while executing query: Failure while
>>> executing
>>>>>>>>>>> query.
>>>>>>>>>>>>>>> (state=,code=0)
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> ​
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Chris Matta
>>>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>>>> 215-701-3146
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Wed, Jan 21, 2015 at 4:50 PM, Aditya <
>>>>>> [email protected]
>>>>>>>>> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> repeated_count('entities.urls') > 0
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Wed, Jan 21, 2015 at 1:46 PM, Andries Engelbrecht <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> How do you filter out records with an empty array in
>> drill?
>>>>>>>>>>>>>>>>>> i.e some records have "url":[]  and some will have an
>> array
>>>>>>>> with
>>>>>>>>>>> data
>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>> it. When trying to read records with data in the array
>>> drill
>>>>>>>> fails
>>>>>>>>>>>>> due
>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> records missing any data in the array. Trying a filter
>>> with/*
>>>>>>>> where
>>>>>>>>>>>>>>>>>> "url":[0] is not null */ fails, also fails if applying
>> url
>>> is
>>>>>>>> not
>>>>>>>>>>>>>>> null.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Note some of the arrays contains maps, using twitter data
>>> as
>>>>>> an
>>>>>>>>>>>>>>> example
>>>>>>>>>>>>>>>>>> below. Some records have an empty array with
>> “hashtags”:[]
>>>>>> and
>>>>>>>>>>>>> others
>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>> look similar to what is listed below.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> "entities": {
>>>>>>>>>>>>>>>>>> "trends": [],
>>>>>>>>>>>>>>>>>> "symbols": [],
>>>>>>>>>>>>>>>>>> "urls": [],
>>>>>>>>>>>>>>>>>> "hashtags": [
>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>  "text": "GoPatriots",
>>>>>>>>>>>>>>>>>>  "indices": [
>>>>>>>>>>>>>>>>>>    83,
>>>>>>>>>>>>>>>>>>    94
>>>>>>>>>>>>>>>>>>  ]
>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>> ],
>>>>>>>>>>>>>>>>>> "user_mentions": []
>>>>>>>>>>>>>>>>>> },
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>> —Andries
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> <z.json>
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>> 
>>> 
>> 

Reply via email to