Awesome that worked. *The documentation should probably be updated on the array stuff, it's not accurate as it pertains to empty arrays.
On Mon, Oct 19, 2015 at 2:52 PM, Andries Engelbrecht < [email protected]> wrote: > Use where a[0] is not null > > 0: jdbc:drill:> select * from `./array.json`; > +----+--------+ > | b | a | > +----+--------+ > | 1 | [] | > | 3 | [1,2] | > +----+--------+ > 2 rows selected (0.13 seconds) > 0: jdbc:drill:> select * from `./array.json` where a[0] is not null; > +----+--------+ > | b | a | > +----+--------+ > | 3 | [1,2] | > +----+--------+ > 1 row selected (0.151 seconds) > > —Andries > > > > On Oct 19, 2015, at 12:32 PM, John Omernik <[email protected]> wrote: > > > > Well you are in a sense confirming my suspicions that an empty array, as > > specified in the Docs as "error causing" doesn't actually cause an error, > > and that is expected. That is, empty arrays are not the big meanies that > > the docs make them out to be (my results are the same as your, that is, > no > > errors). > > > > I like the flatten approach, but is there a simple way to say select * > from > > dfs.tdunning.`x.json` where > > > > a is not empty > > > > or > > > > size(a) == 0 > > > > or > > > > a != [] > > > > > > I guess some functions for working with arrays would be handy. I'll play > > with flatten to see if it gives me what I am looking for, but are there > > other ways to play with arrays (now that I confirm that empty arrays > aren't > > evil) > > > > John > > > > > > On Mon, Oct 19, 2015 at 1:40 PM, Ted Dunning <[email protected]> > wrote: > > > >> John, > >> > >> I don't understand what you are seeing. Here is what I am seeing (and > >> hopefully you can tell what I am missing). > >> > >> First the input is: > >> > >> $ cat x.json > >> {"b":1, "a":[] } > >> {"a":[1,2], "b":3} > >> > >> And then with this input, I get this: > >> > >> 0: jdbc:drill:> select * from dfs.tdunning.`x.json`; > >> +----+------------+ > >> | b | a | > >> +----+------------+ > >> | 1 | [] | > >> | 3 | ["1","2"] | > >> +----+------------+ > >> 2 rows selected (0.443 seconds) > >> 0: jdbc:drill:> select a,b from dfs.tdunning.`x.json`; > >> +------------+----+ > >> | a | b | > >> +------------+----+ > >> | [] | 1 | > >> | ["1","2"] | 3 | > >> +------------+----+ > >> 2 rows selected (0.473 seconds) > >> 0: jdbc:drill:> select flatten(a),b from dfs.tdunning.`x.json`; > >> +---------+----+ > >> | EXPR$0 | b | > >> +---------+----+ > >> | 1 | 3 | > >> | 2 | 3 | > >> +---------+----+ > >> 2 rows selected (0.499 seconds) > >> > >> > >> On Mon, Oct 19, 2015 at 7:03 AM, John Omernik <[email protected]> wrote: > >> > >>> In https://drill.apache.org/docs/json-data-model/ there is a section > >> that > >>> goes as laid out below. This is actually not occurring for me. I > have a > >>> json dump from Mongo that has a field called tags where many records > have > >>> "tags":[] and it's outputting that without error. (It just shows [] as > >> the > >>> output). > >>> > >>> So, my question is this... based on the documentation, what I am seeing > >> is > >>> NOT expected, is it a miss on the docs, or something that is fixed in > the > >>> 1.2 release that I have? > >>> > >>> If it is fixed so we can have empty arrays in a field like tags, is it > >>> possible there are some functions I can use to determine if that field > is > >>> empty? i.e. if isemptyarray(tags) returns true if empty or perhaps get > me > >>> the length said array? These functions would be very valuable in > >> queries > >>> (if the empty arrays thing is not a weird quirk I am seeing). > >>> > >>> Empty array > >>> > >>> Drill cannot read an empty array, shown in the following example, and > >>> attempting to do so causes an error. > >>> > >>> { "a":[] } > >>> > >>> Workaround: Remove empty arrays. > >>> > >> > >
