I am not sure if below is expected behavior.
If we only select "hashtags", and it will return only 1 row ignoring the
null value.
However then if we try to get "hashtags.text", it fails...which means it is
still trying to read the NULL value.
I am thinking it may confuse the SQL developers.


0: jdbc:drill:> select t.entities.hashtags from dfs.tmp.`d.json` t ;
+------------+
|   EXPR$0   |
+------------+
| [{"text":"GoPatriots"}] |
+------------+
1 row selected (0.109 seconds)


0: jdbc:drill:> select t.entities.hashtags.text from dfs.tmp.`d.json` t ;

Query failed: Query failed: Failure while running fragment.,
org.apache.drill.exec.vector.complex.RepeatedMapVector cannot be cast to
org.apache.drill.exec.vector.complex.MapVector [
7ab63d4e-8a1d-4e23-8853-a879db7e8a5f on maprdemo:31010 ]
[ 7ab63d4e-8a1d-4e23-8853-a879db7e8a5f on maprdemo:31010 ]


Error: exception while executing query: Failure while executing query.
(state=,code=0)

Thanks,
Hao

On Wed, Jan 21, 2015 at 2:43 PM, Andries Engelbrecht <
[email protected]> wrote:

> Now try on hashtags with the following:
>
> drilldemo:5181> select t.entities.hashtags.`text` from `/twitter.json` t
> where t.entities.hashtags is not null limit 10;
>
> Query failed: Query failed: Failure while running fragment.,
> org.apache.drill.exec.vector.complex.RepeatedMapVector cannot be cast to
> org.apache.drill.exec.vector.complex.MapVector [
> 6fe7f918-d1a7-4fc6-b24d-44ff9186f59e on drilldemo:31010 ]
> [ 6fe7f918-d1a7-4fc6-b24d-44ff9186f59e on drilldemo:31010 ]
>
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
>
> {
> "entities": {
>    "trends": [],
>    "symbols": [],
>    "urls": [],
>    "hashtags": [],
>    "user_mentions": []
>  },
> "entities": {
>    "trends": [1,2,3],
>    "symbols": [4,5,6],
>    "urls": [7,8,9],
>    "hashtags": [
>      {
>        "text": "GoPatriots",
>        "indices": []
>      }
>    ],
>    "user_mentions": []
>  }
> }
>
> The issue seems to be that if some records have arrays with maps in them
> and others are empty.
>
> —Andries
>
>
> On Jan 21, 2015, at 2:34 PM, Hao Zhu <[email protected]> wrote:
>
> > Seems it works for below json file:
> > {
> > "entities": {
> >    "trends": [],
> >    "symbols": [],
> >    "urls": [],
> >    "hashtags": [
> >      {
> >        "text": "GoPatriots",
> >        "indices": [
> >          83,
> >          94
> >        ]
> >      }
> >    ],
> >    "user_mentions": []
> >  },
> > "entities": {
> >    "trends": [1,2,3],
> >    "symbols": [4,5,6],
> >    "urls": [7,8,9],
> >    "hashtags": [
> >      {
> >        "text": "GoPatriots",
> >        "indices": []
> >      }
> >    ],
> >    "user_mentions": []
> >  }
> > }
> >
> >
> > 0: jdbc:drill:> select t.entities.urls from dfs.tmp.`a.json` as t where
> > t.entities.urls is not null;
> > +------------+
> > |   EXPR$0   |
> > +------------+
> > | [7,8,9]    |
> > +------------+
> > 1 row selected (0.139 seconds)
> > 0: jdbc:drill:> select t.entities.urls from dfs.tmp.`a.json` as t where
> > t.entities.urls is null;
> > +------------+
> > |   EXPR$0   |
> > +------------+
> > +------------+
> > No rows selected (0.158 seconds)
> >
> > Thanks,
> > Hao
> >
> > On Wed, Jan 21, 2015 at 2:01 PM, Aditya <[email protected]> wrote:
> >
> >> I believe that this works if the array contains homogeneous primitive
> >> types. In your example, it appears from the error, the array field
> 'member'
> >> contained maps for at least one record.
> >>
> >> On Wed, Jan 21, 2015 at 1:57 PM, Christopher Matta <[email protected]>
> >> wrote:
> >>
> >>> Trying that locally did not work for me (drill 0.7.0):
> >>>
> >>> 0: jdbc:drill:zk=local> select `id`, `name`, `members` from
> >> `Downloads/test.json` where repeated_count(`members`) > 0;
> >>> Query failed: Query stopped., Failure while trying to materialize
> >> incoming schema.  Errors:
> >>>
> >>> Error in expression at index -1.  Error: Missing function
> >> implementation: [repeated_count(MAP-REPEATED)].  Full expression:
> --UNKNOWN
> >> EXPRESSION--.. [ 47142fa4-7e6a-48cb-be6a-676e885ede11 on
> bullseye-3:31010 ]
> >>>
> >>> Error: exception while executing query: Failure while executing query.
> >> (state=,code=0)
> >>>
> >>> ​
> >>>
> >>> Chris Matta
> >>> [email protected]
> >>> 215-701-3146
> >>>
> >>> On Wed, Jan 21, 2015 at 4:50 PM, Aditya <[email protected]>
> wrote:
> >>>
> >>>> repeated_count('entities.urls') > 0
> >>>>
> >>>> On Wed, Jan 21, 2015 at 1:46 PM, Andries Engelbrecht <
> >>>> [email protected]> wrote:
> >>>>
> >>>>> How do you filter out records with an empty array in drill?
> >>>>> i.e some records have "url":[]  and some will have an array with data
> >> in
> >>>>> it. When trying to read records with data in the array drill fails
> due
> >>>> to
> >>>>> records missing any data in the array. Trying a filter with/* where
> >>>>> "url":[0] is not null */ fails, also fails if applying url is not
> >> null.
> >>>>>
> >>>>> Note some of the arrays contains maps, using twitter data as an
> >> example
> >>>>> below. Some records have an empty array with “hashtags”:[]  and
> others
> >>>> will
> >>>>> look similar to what is listed below.
> >>>>>
> >>>>> "entities": {
> >>>>>    "trends": [],
> >>>>>    "symbols": [],
> >>>>>    "urls": [],
> >>>>>    "hashtags": [
> >>>>>      {
> >>>>>        "text": "GoPatriots",
> >>>>>        "indices": [
> >>>>>          83,
> >>>>>          94
> >>>>>        ]
> >>>>>      }
> >>>>>    ],
> >>>>>    "user_mentions": []
> >>>>>  },
> >>>>>
> >>>>>
> >>>>> Thanks
> >>>>> —Andries
> >>>>
> >>>
> >>>
> >>
>
>

Reply via email to