Use where a[0] is not null 0: jdbc:drill:> select * from `./array.json`; +----+--------+ | b | a | +----+--------+ | 1 | [] | | 3 | [1,2] | +----+--------+ 2 rows selected (0.13 seconds) 0: jdbc:drill:> select * from `./array.json` where a[0] is not null; +----+--------+ | b | a | +----+--------+ | 3 | [1,2] | +----+--------+ 1 row selected (0.151 seconds)
—Andries > On Oct 19, 2015, at 12:32 PM, John Omernik <[email protected]> wrote: > > Well you are in a sense confirming my suspicions that an empty array, as > specified in the Docs as "error causing" doesn't actually cause an error, > and that is expected. That is, empty arrays are not the big meanies that > the docs make them out to be (my results are the same as your, that is, no > errors). > > I like the flatten approach, but is there a simple way to say select * from > dfs.tdunning.`x.json` where > > a is not empty > > or > > size(a) == 0 > > or > > a != [] > > > I guess some functions for working with arrays would be handy. I'll play > with flatten to see if it gives me what I am looking for, but are there > other ways to play with arrays (now that I confirm that empty arrays aren't > evil) > > John > > > On Mon, Oct 19, 2015 at 1:40 PM, Ted Dunning <[email protected]> wrote: > >> John, >> >> I don't understand what you are seeing. Here is what I am seeing (and >> hopefully you can tell what I am missing). >> >> First the input is: >> >> $ cat x.json >> {"b":1, "a":[] } >> {"a":[1,2], "b":3} >> >> And then with this input, I get this: >> >> 0: jdbc:drill:> select * from dfs.tdunning.`x.json`; >> +----+------------+ >> | b | a | >> +----+------------+ >> | 1 | [] | >> | 3 | ["1","2"] | >> +----+------------+ >> 2 rows selected (0.443 seconds) >> 0: jdbc:drill:> select a,b from dfs.tdunning.`x.json`; >> +------------+----+ >> | a | b | >> +------------+----+ >> | [] | 1 | >> | ["1","2"] | 3 | >> +------------+----+ >> 2 rows selected (0.473 seconds) >> 0: jdbc:drill:> select flatten(a),b from dfs.tdunning.`x.json`; >> +---------+----+ >> | EXPR$0 | b | >> +---------+----+ >> | 1 | 3 | >> | 2 | 3 | >> +---------+----+ >> 2 rows selected (0.499 seconds) >> >> >> On Mon, Oct 19, 2015 at 7:03 AM, John Omernik <[email protected]> wrote: >> >>> In https://drill.apache.org/docs/json-data-model/ there is a section >> that >>> goes as laid out below. This is actually not occurring for me. I have a >>> json dump from Mongo that has a field called tags where many records have >>> "tags":[] and it's outputting that without error. (It just shows [] as >> the >>> output). >>> >>> So, my question is this... based on the documentation, what I am seeing >> is >>> NOT expected, is it a miss on the docs, or something that is fixed in the >>> 1.2 release that I have? >>> >>> If it is fixed so we can have empty arrays in a field like tags, is it >>> possible there are some functions I can use to determine if that field is >>> empty? i.e. if isemptyarray(tags) returns true if empty or perhaps get me >>> the length said array? These functions would be very valuable in >> queries >>> (if the empty arrays thing is not a weird quirk I am seeing). >>> >>> Empty array >>> >>> Drill cannot read an empty array, shown in the following example, and >>> attempting to do so causes an error. >>> >>> { "a":[] } >>> >>> Workaround: Remove empty arrays. >>> >>
