[ 
https://issues.apache.org/jira/browse/DRILL-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528026#comment-16528026
 ] 

ASF GitHub Bot commented on DRILL-6553:
---------------------------------------

vvysotskyi commented on a change in pull request #1353: DRILL-6553: Fix TopN 
for unnest operator
URL: https://github.com/apache/drill/pull/1353#discussion_r199237071
 
 

 ##########
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/lateraljoin/TestE2EUnnestAndLateral.java
 ##########
 @@ -394,4 +401,30 @@ public void testLateral_HashAgg_with_nulls() throws 
Exception {
       .baselineValues("dd",222L)
       .build().run();
   }
+
+  @Test
+  public void testUnnestTopN() throws Exception {
 
 Review comment:
   Thanks, done.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Fix TopN for unnest operator
> ----------------------------
>
>                 Key: DRILL-6553
>                 URL: https://issues.apache.org/jira/browse/DRILL-6553
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Volodymyr Vysotskyi
>            Assignee: Volodymyr Vysotskyi
>            Priority: Major
>             Fix For: 1.14.0
>
>
> Plan for the query with unnest is chosen non-optimally:
> {code:sql}
> select customer.c_custkey, customer.c_name, t.o.o_orderkey,t.o.o_totalprice
> from dfs.`lateraljoin/multipleFiles` customer,
> unnest(customer.c_orders) t(o)
> order by customer.c_custkey, t.o.o_orderkey, t.o.o_totalprice
> limit 50
> {code}
> Plan:
> {noformat}
> 00-00    Screen
> 00-01      ProjectAllowDup(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], 
> EXPR$3=[$3])
> 00-02        SelectionVectorRemover
> 00-03          Limit(fetch=[50])
> 00-04            SelectionVectorRemover
> 00-05              Sort(sort0=[$0], sort1=[$2], sort2=[$3], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 00-06                Project(c_custkey=[$2], c_name=[$3], EXPR$2=[ITEM($4, 
> 'o_orderkey')], EXPR$3=[ITEM($4, 'o_totalprice')])
> 00-07                  LateralJoin(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{1}])
> 00-09                    Project(T0¦¦**=[$0], c_orders=[$1], c_custkey=[$2], 
> c_name=[$3])
> 00-11                      Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`**`], 
> files=[file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_2.json,
>  
> file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_1.json]]])
> 00-08                    Project(c_orders0=[$0])
> 00-10                      Unnest [srcOp=00-07] 
> {noformat}
> A similar query, but with flatten:
> {code:sql}
> select f.c_custkey, f.c_name, f.o.o_orderkey, f.o.o_totalprice from (select 
> c_custkey, c_name, flatten(c_orders) as o from 
> dfs.`lateraljoin/multipleFiles` customer) f order by f.c_custkey, 
> f.o.o_orderkey, f.o.o_totalprice limit 50
> {code}
> has plan:
> {noformat}
> 00-00    Screen
> 00-01      Project(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], EXPR$3=[$3])
> 00-02        SelectionVectorRemover
> 00-03          Limit(fetch=[50])
> 00-04            SelectionVectorRemover
> 00-05              TopN(limit=[50])
> 00-06                Project(c_custkey=[$0], c_name=[$1], EXPR$2=[ITEM($2, 
> 'o_orderkey')], EXPR$3=[ITEM($2, 'o_totalprice')])
> 00-07                  Flatten(flattenField=[$2])
> 00-08                    Project(c_custkey=[$0], c_name=[$1], o=[$2])
> 00-09                      Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`c_custkey`, `c_name`, `c_orders`], 
> files=[file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_2.json,
>  
> file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_1.json]]])
> {noformat}
> The main difference is that for the case of unnest, a project wasn't pushed 
> to the scan and Limit with Sort weren't converted to TopN. 
> The first problem is tracked by DRILL-6545 and this Jira aims to fix the 
> problem with TopN



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to