I get the same error as Chris when trying to use repeated_count. The issue is that some arrays will have maps/nested maps, and some will be empty.
Thus, taking the Twitter example, trying to retrieve entities.hashtags.text will fail because some entities.hashtags arrays are empty. Using null detection fails because the hashtags array is empty for some records.

The repeated_count enhancement that Jason refers to should be able to resolve this; a simpler alternative would be a null or other function to filter out empty arrays at any level in the JSON tree.

—Andries

On Jan 21, 2015, at 2:01 PM, Aditya <[email protected]> wrote:

> I believe that this works if the array contains homogeneous primitive
> types. In your example, it appears from the error, the array field 'member'
> contained maps for at least one record.
>
> On Wed, Jan 21, 2015 at 1:57 PM, Christopher Matta <[email protected]> wrote:
>
>> Trying that locally did not work for me (drill 0.7.0):
>>
>> 0: jdbc:drill:zk=local> select `id`, `name`, `members` from
>> `Downloads/test.json` where repeated_count(`members`) > 0;
>> Query failed: Query stopped., Failure while trying to materialize incoming
>> schema. Errors:
>>
>> Error in expression at index -1. Error: Missing function implementation:
>> [repeated_count(MAP-REPEATED)]. Full expression: --UNKNOWN EXPRESSION--.. [
>> 47142fa4-7e6a-48cb-be6a-676e885ede11 on bullseye-3:31010 ]
>>
>> Error: exception while executing query: Failure while executing query.
>> (state=,code=0)
>>
>> Chris Matta
>> [email protected]
>> 215-701-3146
>>
>> On Wed, Jan 21, 2015 at 4:50 PM, Aditya <[email protected]> wrote:
>>
>>> repeated_count('entities.urls') > 0
>>>
>>> On Wed, Jan 21, 2015 at 1:46 PM, Andries Engelbrecht <[email protected]> wrote:
>>>
>>>> How do you filter out records with an empty array in drill?
>>>> i.e some records have "url":[] and some will have an array with data in
>>>> it. When trying to read records with data in the array drill fails due to
>>>> records missing any data in the array. Trying a filter with/* where
>>>> "url":[0] is not null */ fails, also fails if applying url is not null.
>>>>
>>>> Note some of the arrays contains maps, using twitter data as an example
>>>> below. Some records have an empty array with “hashtags”:[] and others will
>>>> look similar to what is listed below.
>>>>
>>>> "entities": {
>>>>   "trends": [],
>>>>   "symbols": [],
>>>>   "urls": [],
>>>>   "hashtags": [
>>>>     {
>>>>       "text": "GoPatriots",
>>>>       "indices": [
>>>>         83,
>>>>         94
>>>>       ]
>>>>     }
>>>>   ],
>>>>   "user_mentions": []
>>>> },
>>>>
>>>> Thanks
>>>> —Andries
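
Until repeated_count handles arrays of maps, one possible stopgap is to drop the records with empty arrays before Drill reads the file. This is only a rough sketch done outside Drill, not a Drill feature. It assumes the data is newline-delimited JSON with one tweet per line, and the helper names (has_nonempty_array, filter_records, hashtag_texts) are made up for illustration.

```python
import json


def has_nonempty_array(record, path):
    """Return True if the dotted path (e.g. "entities.hashtags")
    resolves to a list with at least one element."""
    node = record
    for key in path.split("."):
        # A missing key or a non-map level counts as "no data".
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return isinstance(node, list) and len(node) > 0


def filter_records(lines, path):
    """Yield parsed records whose array at `path` is non-empty.

    `lines` is any iterable of JSON strings, one record per line
    (an assumption about the input layout)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if has_nonempty_array(record, path):
            yield record


def hashtag_texts(record):
    """Collect entities.hashtags.text values from one kept record."""
    return [tag["text"] for tag in record["entities"]["hashtags"]]
```

The kept records could then be written back out with json.dumps, one per line, to a new file for Drill to query, so that every remaining record has data in entities.hashtags.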
