It is a good work around. 

Thanks a lot Andries.


On Feb 3, 2015, at 3:11 PM, Andries Engelbrecht <[email protected]> 
wrote:

> Sudhakar,
> 
> Use an outer join for your desired outcome. As discussed flatten behaves per 
> spec, but the spec may need some discussion.
> 
> Try this for a workaround
> 
> select q.a, q.b, q.c.x, q.c.y, zz.flat from dfs.twitter.`sudhakar.json` q 
> left join (select t.a, flatten(t.c.z) as flat from 
> dfs.twitter.`sudhakar.json` t) zz on (q.a=zz.a);
> 
> 
> —Andries
> 
> 
> 
> On Feb 3, 2015, at 2:53 PM, Sudhakar Thota <[email protected]> wrote:
> 
>> Jason,
>> 
>> Thank you. 
>> 
>> I was testing this on customer data and found this limitation. 
>> As I cannot put out the customer data, I have used some test data to verify 
>> it myself. 
>> 
>> Please let me know if there is any work around to get nulls or empty spaces 
>> wherever the array is empty.
>> 
>> Steven,
>> 
>> I remember using the similar one on mongo, I don’t remember that eliminating 
>> the records. But I can check. I haven’t used Big Query for this, I used it 
>> for flat tables, but I can check.
>> 
>> Thanks
>> Sudhakar Thota
>> 
>> 
>> 
>> On Feb 3, 2015, at 2:45 PM, Jason Altekruse <[email protected]> wrote:
>> 
>>> Sudhakar opened an issue for this, so I responded there. Steven is right,
>>> this is the current expected functionality, but I discuss there the reasons
>>> for it and opened the discussion for use cases that need this functionality.
>>> 
>>> On Tue, Feb 3, 2015 at 2:28 PM, Steven Phillips <[email protected]>
>>> wrote:
>>> 
>>>> I think flatten is functioning as designed, but perhaps we need to treat
>>>> this case differently. It might make sense to return a single row with a
>>>> null for the flattened column in this case. The only model for "flatten"
>>>> that I know of is google Big Query. We should see how they handle this
>>>> case.
>>>> 
>>>> On Tue, Feb 3, 2015 at 12:27 PM, MapR <[email protected]> wrote:
>>>> 
>>>>> In a table, column with null values we
>>>>> present as nulls or empty values with other columns. I never had an
>>>>> experience with c1, c2, c3 (columns ) not presented when c3 has null
>>>> values
>>>>> even though c1, c2 have notnull values.
>>>>> 
>>>>> Sudhakar Thota
>>>>> Sent from my iPhone
>>>>> 
>>>>>> On Feb 3, 2015, at 11:57 AM, Andries Engelbrecht <
>>>>> [email protected]> wrote:
>>>>>> 
>>>>>> In listing 3 it looks like flatten is doing the correct thing by
>>>>> representing each element in the array with a new row and the element in
>>>>> the column output.
>>>>>> And also for arrays that are empty there is no output row.
>>>>>> 
>>>>>> —Andries
>>>>>> 
>>>>>>> On Feb 3, 2015, at 11:54 AM, Sudhakar Thota <[email protected]>
>>>>> wrote:
>>>>>>> 
>>>>>>> It will be more appropriate if I add my file to have data to
>>>> facilitate
>>>>> quick test.
>>>>>>> 
>>>>>>> [root@maprdemo clicks]# cat sthota_test_1.json
>>>>>>> {"a":"r1cl1", "b":"r1c2","c":{ "x" : 1, "y" : "a string", "z" : [ 1]
>>>> }}
>>>>>>> {"a":"r2cl1", "b":"r2c2","c":{ "x" : 2, "y" : "a string", "z" : [ 1,2]
>>>>> }}
>>>>>>> {"a":"r3cl1", "b":"r3c2","c":{ "x" : 3, "y" : "a string", "z" : [
>>>>> 1,2,3] }}
>>>>>>> {"a":"r4cl1", "b":"r4c2","c":{ "x" : 4, "y" : "a string", "z" : [
>>>>> 1,2,3,4] }}
>>>>>>> {"a":"r5cl1", "b":"r5c2","c":{ "x" : 5, "y" : "a string", "z" : [
>>>>> 1,2,3,4,5] }}
>>>>>>> {"a":"r6cl1", "b":"r6c2","c":{ "x" : 6, "y" : "a string"  }}
>>>>>>> 
>>>>>>> 
>>>>>>> Begin forwarded message:
>>>>>>> 
>>>>>>>> From: Sudhakar Thota <[email protected]>
>>>>>>>> Subject: Drill - Flatten function - help please
>>>>>>>> Date: February 3, 2015 at 11:52:39 AM PST
>>>>>>>> To: [email protected]
>>>>>>>> 
>>>>>>>> Hi Drillers,
>>>>>>>> 
>>>>>>>> I have a request here about flatten function. I want to know if this
>>>>> is an expected behavior of drill flatten function with json documents.
>>>>>>>> 
>>>>>>>> Listing-1 is contents of my json collection where I have
>>>> intentionally
>>>>> included a empty array.
>>>>>>>> Listing-2 shows that if I don’t call flatten function it shows all
>>>>> records.
>>>>>>>> Listing-3 shows that if I call flatten function, row6 (r6cl1…) is
>>>>> ignored. I am using that in my projections not in predicate, but it looks
>>>>> like treated as predicate.
>>>>>>>> 
>>>>>>>> Please correct me if I am wrong and let me know if there is any work
>>>>> around.
>>>>>>>> 
>>>>>>>> Listing:1
>>>>>>>> 
>>>>>>>> 0: jdbc:drill:> SELECT  t.a,t.b,t.c.x, t.c.y, t.c.z from
>>>>> dfs.`/data/nested/clicks/sthota_test_1.json` as t;
>>>>>>>> +------------+------------+------------+------------+------------+
>>>>>>>> |     a      |     b      |   EXPR$2   |   EXPR$3   |   EXPR$4   |
>>>>>>>> +------------+------------+------------+------------+------------+
>>>>>>>> | r1cl1      | r1c2       | 1          | a string   | ["1"]      |
>>>>>>>> | r2cl1      | r2c2       | 2          | a string   | ["1","2"]  |
>>>>>>>> | r3cl1      | r3c2       | 3          | a string   | ["1","2","3"] |
>>>>>>>> | r4cl1      | r4c2       | 4          | a string   |
>>>>> ["1","2","3","4"] |
>>>>>>>> | r5cl1      | r5c2       | 5          | a string   |
>>>>> ["1","2","3","4","5"] |
>>>>>>>> | r6cl1      | r6c2       | 6          | a string   | []         |
>>>>>>>> +------------+------------+------------+------------+------------+
>>>>>>>> 6 rows selected (0.125 seconds)
>>>>>>>> 0: jdbc:drill:>
>>>>>>>> 
>>>>>>>> Listing:2
>>>>>>>> 
>>>>>>>> 0: jdbc:drill:> SELECT  t.a,t.b,t.c.x, t.c.y from
>>>>> dfs.`/data/nested/clicks/sthota_test_1.json` as t;
>>>>>>>> +------------+------------+------------+------------+
>>>>>>>> |     a      |     b      |   EXPR$2   |   EXPR$3   |
>>>>>>>> +------------+------------+------------+------------+
>>>>>>>> | r1cl1      | r1c2       | 1          | a string   |
>>>>>>>> | r2cl1      | r2c2       | 2          | a string   |
>>>>>>>> | r3cl1      | r3c2       | 3          | a string   |
>>>>>>>> | r4cl1      | r4c2       | 4          | a string   |
>>>>>>>> | r5cl1      | r5c2       | 5          | a string   |
>>>>>>>> | r6cl1      | r6c2       | 6          | a string   |
>>>>>>>> +------------+------------+------------+------------+
>>>>>>>> 6 rows selected (0.163 seconds)
>>>>>>>> 
>>>>>>>> Listing:3
>>>>>>>> 
>>>>>>>> 0: jdbc:drill:> SELECT  t.a,t.b,t.c.x, t.c.y, flatten(t.c.z) from
>>>>> dfs.`/data/nested/clicks/sthota_test_1.json` as t;
>>>>>>>> +------------+------------+------------+------------+------------+
>>>>>>>> |     a      |     b      |   EXPR$2   |   EXPR$3   |   EXPR$4   |
>>>>>>>> +------------+------------+------------+------------+------------+
>>>>>>>> | r1cl1      | r1c2       | 1          | a string   | 1          |
>>>>>>>> | r2cl1      | r2c2       | 2          | a string   | 1          |
>>>>>>>> | r2cl1      | r2c2       | 2          | a string   | 2          |
>>>>>>>> | r3cl1      | r3c2       | 3          | a string   | 1          |
>>>>>>>> | r3cl1      | r3c2       | 3          | a string   | 2          |
>>>>>>>> | r3cl1      | r3c2       | 3          | a string   | 3          |
>>>>>>>> | r4cl1      | r4c2       | 4          | a string   | 1          |
>>>>>>>> | r4cl1      | r4c2       | 4          | a string   | 2          |
>>>>>>>> | r4cl1      | r4c2       | 4          | a string   | 3          |
>>>>>>>> | r4cl1      | r4c2       | 4          | a string   | 4          |
>>>>>>>> | r5cl1      | r5c2       | 5          | a string   | 1          |
>>>>>>>> | r5cl1      | r5c2       | 5          | a string   | 2          |
>>>>>>>> | r5cl1      | r5c2       | 5          | a string   | 3          |
>>>>>>>> | r5cl1      | r5c2       | 5          | a string   | 4          |
>>>>>>>> | r5cl1      | r5c2       | 5          | a string   | 5          |
>>>>>>>> +------------+------------+------------+------------+------------+
>>>>>>>> 15 rows selected (0.171 seconds)
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> Sudhakar Thota
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Steven Phillips
>>>> Software Engineer
>>>> 
>>>> mapr.com
>>>> 
>> 
> 

Reply via email to