[ 
https://issues.apache.org/jira/browse/DRILL-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297870#comment-14297870
 ] 

Rahul Challapalli commented on DRILL-2121:
------------------------------------------

Explain plan for the above query:
{code}
00-00    Screen
00-01      Project(uid=[$0], event=[$1], transaction=[$2])
00-02        Project(uid=[$0], event=[$1], transaction=[$3])
00-03          HashJoin(condition=[=($0, $2)], joinType=[inner])
00-05            Project(uid=[$1], event=[$2])
00-07              Flatten(flattenField=[$2])
00-09                Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$0])
00-11                  Scan(groupscan=[EasyGroupScan [selectionRoot=/drill/testdata/flatten_operators/temp3.json, numFiles=1, columns=[`uid`, `events`], files=[maprfs:/drill/testdata/flatten_operators/temp3.json]]])
00-04            Project(uid0=[$0], transaction=[$1])
00-06              Project(uid=[$0], transaction=[$2])
00-08                Flatten(flattenField=[$2])
00-10                  Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$1])
00-12                    Scan(groupscan=[EasyGroupScan [selectionRoot=/drill/testdata/flatten_operators/temp4.json, numFiles=1, columns=[`uid`, `transactions`], files=[maprfs:/drill/testdata/flatten_operators/temp4.json]]])
{code}
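
For comparison, here is a minimal Python sketch of what the join would be expected to return, assuming ordinary inner-join semantics over the two flattened arrays. It only models the expected result and is not Drill's implementation. The helper names `flatten` and `expected_join` are hypothetical, and the event maps are abbreviated to `evnt_id` to keep the sketch short.

```python
# Sketch only: models the expected query semantics, not Drill internals.
record = {
    "uid": 1,
    # Events abbreviated to evnt_id; the full maps are in the issue's data set.
    "events": [{"evnt_id": f"e{i}"} for i in range(1, 10)],
    "transactions": [
        {"trans_id": "t1", "amount": 100, "trans_time": 7777777, "type": "sports"},
        {"trans_id": "t2", "amount": 1000, "trans_time": 8888888, "type": "groceries"},
    ],
}


def flatten(rec, field, alias):
    """Emit one row per element of rec[field], repeating uid on each row."""
    return [{"uid": rec["uid"], alias: item} for item in rec[field]]


def expected_join(rec):
    """Inner-join the flattened events and transactions on uid."""
    t1 = flatten(rec, "events", "event")
    t2 = flatten(rec, "transactions", "transaction")
    return [
        {"uid": left["uid"], "event": left["event"], "transaction": right["transaction"]}
        for left in t1
        for right in t2
        if left["uid"] == right["uid"]
    ]


rows = expected_join(record)
print(len(rows))  # 18 (9 events x 2 transactions)
print(all(row["event"] and row["transaction"] for row in rows))  # True
```

Under these assumptions every one of the 18 rows carries a populated map in both the event and transaction columns, which is what the reported output fails to do.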

> Join on complex data with sub-queries is returning empty maps
> -------------------------------------------------------------
>
>                 Key: DRILL-2121
>                 URL: https://issues.apache.org/jira/browse/DRILL-2121
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Operators
>            Reporter: Rahul Challapalli
>            Assignee: Hanifi Gunes
>            Priority: Critical
>
> git.commit.id.abbrev=3e33880
> Data set:
> {code}
> {
>     "uid": 1,
>     "events" : [
>         { "evnt_id":"e1", "campaign_id":"c1", "event_name":"e1_name", "event_time":1000000, "type" : "cmpgn1"},
>         { "evnt_id":"e2", "campaign_id":"c1", "event_name":"e2_name", "event_time":2000000, "type" : "cmpgn4"},
>         { "evnt_id":"e3", "campaign_id":"c1", "event_name":"e3_name", "event_time":3000000, "type" : "cmpgn1"},
>         { "evnt_id":"e4", "campaign_id":"c1", "event_name":"e4_name", "event_time":4000000, "type" : "cmpgn1"},
>         { "evnt_id":"e5", "campaign_id":"c2", "event_name":"e5_name", "event_time":5000000, "type" : "cmpgn3"},
>         { "evnt_id":"e6", "campaign_id":"c1", "event_name":"e6_name", "event_time":6000000, "type" : "cmpgn9"},
>         { "evnt_id":"e7", "campaign_id":"c1", "event_name":"e7_name", "event_time":7000000, "type" : "cmpgn3"},
>         { "evnt_id":"e8", "campaign_id":"c2", "event_name":"e8_name", "event_time":8000000, "type" : "cmpgn2"},
>         { "evnt_id":"e9", "campaign_id":"c2", "event_name":"e9_name", "event_time":9000000, "type" : "cmpgn4"}
>   ],
>   "transactions" : [
>        { "trans_id":"t1", "amount":100, "trans_time":7777777, "type":"sports"},
>        { "trans_id":"t2", "amount":1000, "trans_time":8888888, "type":"groceries"}
>   ]
> }
> {code}
> The query below returns empty maps for the third field (transaction):
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDir> select t1.uid, t1.event, t2.transaction
> from (select uid, flatten(events) event from `temp4.json`) t1
> inner join (select uid, flatten(transactions) transaction from `temp4.json`) t2
> on t1.uid = t2.uid;
> +------------+------------+-------------+
> |    uid     |   event    | transaction |
> +------------+------------+-------------+
> | 1          | {"evnt_id":"e1","campaign_id":"c1","event_name":"e1_name","event_time":1000000,"type":"cmpgn1"} | {}          |
> | 1          | {"evnt_id":"e1","campaign_id":"c1","event_name":"e1_name","event_time":1000000,"type":"cmpgn1"} | {}          |
> | 1          | {"evnt_id":"e2","campaign_id":"c1","event_name":"e2_name","event_time":2000000,"type":"cmpgn4"} | {}          |
> | 1          | {"evnt_id":"e2","campaign_id":"c1","event_name":"e2_name","event_time":2000000,"type":"cmpgn4"} | {}          |
> | 1          | {"evnt_id":"e3","campaign_id":"c1","event_name":"e3_name","event_time":3000000,"type":"cmpgn1"} | {}          |
> | 1          | {"evnt_id":"e3","campaign_id":"c1","event_name":"e3_name","event_time":3000000,"type":"cmpgn1"} | {}          |
> | 1          | {"evnt_id":"e4","campaign_id":"c1","event_name":"e4_name","event_time":4000000,"type":"cmpgn1"} | {}          |
> | 1          | {"evnt_id":"e4","campaign_id":"c1","event_name":"e4_name","event_time":4000000,"type":"cmpgn1"} | {}          |
> | 1          | {"evnt_id":"e5","campaign_id":"c2","event_name":"e5_name","event_time":5000000,"type":"cmpgn3"} | {}          |
> | 1          | {"evnt_id":"e5","campaign_id":"c2","event_name":"e5_name","event_time":5000000,"type":"cmpgn3"} | {}          |
> | 1          | {"evnt_id":"e6","campaign_id":"c1","event_name":"e6_name","event_time":6000000,"type":"cmpgn9"} | {}          |
> | 1          | {"evnt_id":"e6","campaign_id":"c1","event_name":"e6_name","event_time":6000000,"type":"cmpgn9"} | {}          |
> | 1          | {"evnt_id":"e7","campaign_id":"c1","event_name":"e7_name","event_time":7000000,"type":"cmpgn3"} | {}          |
> | 1          | {"evnt_id":"e7","campaign_id":"c1","event_name":"e7_name","event_time":7000000,"type":"cmpgn3"} | {}          |
> | 1          | {"evnt_id":"e8","campaign_id":"c2","event_name":"e8_name","event_time":8000000,"type":"cmpgn2"} | {}          |
> | 1          | {"evnt_id":"e8","campaign_id":"c2","event_name":"e8_name","event_time":8000000,"type":"cmpgn2"} | {}          |
> | 1          | {"evnt_id":"e9","campaign_id":"c2","event_name":"e9_name","event_time":9000000,"type":"cmpgn4"} | {}          |
> | 1          | {"evnt_id":"e9","campaign_id":"c2","event_name":"e9_name","event_time":9000000,"type":"cmpgn4"} | {}          |
> {code}
> If we interchange the sub-queries, Drill returns an empty map for event instead:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDir> select t1.uid, t2.event, t1.transaction
> from (select uid, flatten(transactions) transaction from `temp4.json`) t1
> inner join (select uid, flatten(events) event from `temp4.json`) t2
> on t1.uid = t2.uid;
> +------------+------------+-------------+
> |    uid     |   event    | transaction |
> +------------+------------+-------------+
> | 1          | {}         | {"trans_id":"t1","amount":100,"trans_time":7777777,"type":"sports"} |
> | 1          | {}         | {"trans_id":"t1","amount":100,"trans_time":7777777,"type":"sports"} |
> | 1          | {}         | {"trans_id":"t1","amount":100,"trans_time":7777777,"type":"sports"} |
> | 1          | {}         | {"trans_id":"t1","amount":100,"trans_time":7777777,"type":"sports"} |
> | 1          | {}         | {"trans_id":"t1","amount":100,"trans_time":7777777,"type":"sports"} |
> | 1          | {}         | {"trans_id":"t1","amount":100,"trans_time":7777777,"type":"sports"} |
> | 1          | {}         | {"trans_id":"t1","amount":100,"trans_time":7777777,"type":"sports"} |
> | 1          | {}         | {"trans_id":"t1","amount":100,"trans_time":7777777,"type":"sports"} |
> | 1          | {}         | {"trans_id":"t1","amount":100,"trans_time":7777777,"type":"sports"} |
> | 1          | {}         | {"trans_id":"t2","amount":1000,"trans_time":8888888,"type":"groceries"} |
> | 1          | {}         | {"trans_id":"t2","amount":1000,"trans_time":8888888,"type":"groceries"} |
> | 1          | {}         | {"trans_id":"t2","amount":1000,"trans_time":8888888,"type":"groceries"} |
> | 1          | {}         | {"trans_id":"t2","amount":1000,"trans_time":8888888,"type":"groceries"} |
> | 1          | {}         | {"trans_id":"t2","amount":1000,"trans_time":8888888,"type":"groceries"} |
> | 1          | {}         | {"trans_id":"t2","amount":1000,"trans_time":8888888,"type":"groceries"} |
> | 1          | {}         | {"trans_id":"t2","amount":1000,"trans_time":8888888,"type":"groceries"} |
> | 1          | {}         | {"trans_id":"t2","amount":1000,"trans_time":8888888,"type":"groceries"} |
> | 1          | {}         | {"trans_id":"t2","amount":1000,"trans_time":8888888,"type":"groceries"} |
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
