It is a good work around. Thanks a lot Andries.
On Feb 3, 2015, at 3:11 PM, Andries Engelbrecht <[email protected]> wrote: > Sudhakar, > > Use an outer join for your desired outcome. As discussed flatten behaves per > spec, but the spec may need some discussion. > > Try this for a workaround > > select q.a, q.b, q.c.x, q.c.y, zz.flat from dfs.twitter.`sudhakar.json` q > left join (select t.a, flatten(t.c.z) as flat from > dfs.twitter.`sudhakar.json` t) zz on (q.a=zz.a); > > > —Andries > > > > On Feb 3, 2015, at 2:53 PM, Sudhakar Thota <[email protected]> wrote: > >> Jason, >> >> Thank you. >> >> I was testing this on customer data and found this limitation. >> As I cannot put out the customer data, I have used some test data to verify >> it myself. >> >> Please let me know if there is any work around to get nulls or empty spaces >> wherever the array is empty. >> >> Steven, >> >> I remember using the similar one on mongo, I don’t remember that eliminating >> the records. But I can check. I haven’t used Big Query for this, I used it >> for flat tables, but I can check. >> >> Thanks >> Sudhakar Thota >> >> >> >> On Feb 3, 2015, at 2:45 PM, Jason Altekruse <[email protected]> wrote: >> >>> Sudhakar opened an issue for this, so I responded there. Steven is right, >>> this is the current expected functionality, but I discuss there the reasons >>> for it and opened the discussion for use cases that need this functionality. >>> >>> On Tue, Feb 3, 2015 at 2:28 PM, Steven Phillips <[email protected]> >>> wrote: >>> >>>> I think flatten is functioning as designed, but perhaps we need to treat >>>> this case differently. It might make sense to return a single row with a >>>> null for the flattened column in this case. The only model for "flatten" >>>> that I know of is google Big Query. We should see how they handle this >>>> case. >>>> >>>> On Tue, Feb 3, 2015 at 12:27 PM, MapR <[email protected]> wrote: >>>> >>>>> In a table, column with null values we >>>>> present as nulls or empty values with other columns. I never had an >>>>> experience with c1, c2, c3 (columns ) not presented when c3 has null >>>> values >>>>> even though c1, c2 have notnull values. >>>>> >>>>> Sudhakar Thota >>>>> Sent from my iPhone >>>>> >>>>>> On Feb 3, 2015, at 11:57 AM, Andries Engelbrecht < >>>>> [email protected]> wrote: >>>>>> >>>>>> In listing 3 it looks like flatten is doing the correct thing by >>>>> representing each element in the array with a new row and the element in >>>>> the column output. >>>>>> And also for arrays that are empty there is no output row. >>>>>> >>>>>> —Andries >>>>>> >>>>>>> On Feb 3, 2015, at 11:54 AM, Sudhakar Thota <[email protected]> >>>>> wrote: >>>>>>> >>>>>>> It will be more appropriate if I add my file to have data to >>>> facilitate >>>>> quick test. >>>>>>> >>>>>>> [root@maprdemo clicks]# cat sthota_test_1.json >>>>>>> {"a":"r1cl1", "b":"r1c2","c":{ "x" : 1, "y" : "a string", "z" : [ 1] >>>> }} >>>>>>> {"a":"r2cl1", "b":"r2c2","c":{ "x" : 2, "y" : "a string", "z" : [ 1,2] >>>>> }} >>>>>>> {"a":"r3cl1", "b":"r3c2","c":{ "x" : 3, "y" : "a string", "z" : [ >>>>> 1,2,3] }} >>>>>>> {"a":"r4cl1", "b":"r4c2","c":{ "x" : 4, "y" : "a string", "z" : [ >>>>> 1,2,3,4] }} >>>>>>> {"a":"r5cl1", "b":"r5c2","c":{ "x" : 5, "y" : "a string", "z" : [ >>>>> 1,2,3,4,5] }} >>>>>>> {"a":"r6cl1", "b":"r6c2","c":{ "x" : 6, "y" : "a string" }} >>>>>>> >>>>>>> >>>>>>> Begin forwarded message: >>>>>>> >>>>>>>> From: Sudhakar Thota <[email protected]> >>>>>>>> Subject: Drill - Flatten function - help please >>>>>>>> Date: February 3, 2015 at 11:52:39 AM PST >>>>>>>> To: [email protected] >>>>>>>> >>>>>>>> Hi Drillers, >>>>>>>> >>>>>>>> I have a request here about flatten function. I want to know if this >>>>> is an expected behavior of drill flatten function with json documents. >>>>>>>> >>>>>>>> Listing-1 is contents of my json collection where I have >>>> intentionally >>>>> included a empty array. >>>>>>>> Listing-2 shows that if I don’t call flatten function it shows all >>>>> records. >>>>>>>> Listing-3 shows that if I call flatten function, row6 (r6cl1…) is >>>>> ignored. I am using that in my projections not in predicate, but it looks >>>>> like treated as predicate. >>>>>>>> >>>>>>>> Please correct me if I am wrong and let me know if there is any work >>>>> around. >>>>>>>> >>>>>>>> Listing:1 >>>>>>>> >>>>>>>> 0: jdbc:drill:> SELECT t.a,t.b,t.c.x, t.c.y, t.c.z from >>>>> dfs.`/data/nested/clicks/sthota_test_1.json` as t; >>>>>>>> +------------+------------+------------+------------+------------+ >>>>>>>> | a | b | EXPR$2 | EXPR$3 | EXPR$4 | >>>>>>>> +------------+------------+------------+------------+------------+ >>>>>>>> | r1cl1 | r1c2 | 1 | a string | ["1"] | >>>>>>>> | r2cl1 | r2c2 | 2 | a string | ["1","2"] | >>>>>>>> | r3cl1 | r3c2 | 3 | a string | ["1","2","3"] | >>>>>>>> | r4cl1 | r4c2 | 4 | a string | >>>>> ["1","2","3","4"] | >>>>>>>> | r5cl1 | r5c2 | 5 | a string | >>>>> ["1","2","3","4","5"] | >>>>>>>> | r6cl1 | r6c2 | 6 | a string | [] | >>>>>>>> +------------+------------+------------+------------+------------+ >>>>>>>> 6 rows selected (0.125 seconds) >>>>>>>> 0: jdbc:drill:> >>>>>>>> >>>>>>>> Listing:2 >>>>>>>> >>>>>>>> 0: jdbc:drill:> SELECT t.a,t.b,t.c.x, t.c.y from >>>>> dfs.`/data/nested/clicks/sthota_test_1.json` as t; >>>>>>>> +------------+------------+------------+------------+ >>>>>>>> | a | b | EXPR$2 | EXPR$3 | >>>>>>>> +------------+------------+------------+------------+ >>>>>>>> | r1cl1 | r1c2 | 1 | a string | >>>>>>>> | r2cl1 | r2c2 | 2 | a string | >>>>>>>> | r3cl1 | r3c2 | 3 | a string | >>>>>>>> | r4cl1 | r4c2 | 4 | a string | >>>>>>>> | r5cl1 | r5c2 | 5 | a string | >>>>>>>> | r6cl1 | r6c2 | 6 | a string | >>>>>>>> +------------+------------+------------+------------+ >>>>>>>> 6 rows selected (0.163 seconds) >>>>>>>> >>>>>>>> Listing:3 >>>>>>>> >>>>>>>> 0: jdbc:drill:> SELECT t.a,t.b,t.c.x, t.c.y, flatten(t.c.z) from >>>>> dfs.`/data/nested/clicks/sthota_test_1.json` as t; >>>>>>>> +------------+------------+------------+------------+------------+ >>>>>>>> | a | b | EXPR$2 | EXPR$3 | EXPR$4 | >>>>>>>> +------------+------------+------------+------------+------------+ >>>>>>>> | r1cl1 | r1c2 | 1 | a string | 1 | >>>>>>>> | r2cl1 | r2c2 | 2 | a string | 1 | >>>>>>>> | r2cl1 | r2c2 | 2 | a string | 2 | >>>>>>>> | r3cl1 | r3c2 | 3 | a string | 1 | >>>>>>>> | r3cl1 | r3c2 | 3 | a string | 2 | >>>>>>>> | r3cl1 | r3c2 | 3 | a string | 3 | >>>>>>>> | r4cl1 | r4c2 | 4 | a string | 1 | >>>>>>>> | r4cl1 | r4c2 | 4 | a string | 2 | >>>>>>>> | r4cl1 | r4c2 | 4 | a string | 3 | >>>>>>>> | r4cl1 | r4c2 | 4 | a string | 4 | >>>>>>>> | r5cl1 | r5c2 | 5 | a string | 1 | >>>>>>>> | r5cl1 | r5c2 | 5 | a string | 2 | >>>>>>>> | r5cl1 | r5c2 | 5 | a string | 3 | >>>>>>>> | r5cl1 | r5c2 | 5 | a string | 4 | >>>>>>>> | r5cl1 | r5c2 | 5 | a string | 5 | >>>>>>>> +------------+------------+------------+------------+------------+ >>>>>>>> 15 rows selected (0.171 seconds) >>>>>>>> >>>>>>>> Thanks >>>>>>>> Sudhakar Thota >>>>>> >>>>> >>>> >>>> >>>> >>>> -- >>>> Steven Phillips >>>> Software Engineer >>>> >>>> mapr.com >>>> >> >
