amaliujia commented on a change in pull request #1995:
URL: https://github.com/apache/calcite/pull/1995#discussion_r432927229



##########
File path: core/src/test/resources/org/apache/calcite/test/TopDownOptTest.xml
##########
@@ -648,4 +648,80 @@ EnumerableMergeJoin(condition=[AND(=($1, $5), =($0, $4))], 
joinType=[inner])
 ]]>
     </Resource>
   </TestCase>
+    <TestCase name="testHashJoinInnerJoinNotPushDownSort">
+        <Resource name="sql">
+            <![CDATA[
+"select * from
+sales.emp r join sales.bonus s on r.ename=s.ename and r.job=s.job
+order by r.job desc nulls last, r.ename nulls first
+]]>
+        </Resource>
+        <Resource name="planBefore">
+            <![CDATA[
+LogicalSort(sort0=[$2], sort1=[$1], dir0=[DESC-nulls-last], 
dir1=[ASC-nulls-first])
+  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], ENAME0=[$9], JOB0=[$10], 
SAL0=[$11], COMM0=[$12])
+    LogicalJoin(condition=[AND(=($1, $9), =($2, $10))], joinType=[inner])
+      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
+      LogicalTableScan(table=[[CATALOG, SALES, BONUS]])
+]]>
+        </Resource>
+        <Resource name="planAfter">
+            <![CDATA[
+EnumerableSort(sort0=[$2], sort1=[$1], dir0=[DESC-nulls-last], 
dir1=[ASC-nulls-first])
+  EnumerableHashJoin(condition=[AND(=($1, $9), =($2, $10))], joinType=[inner])
+    EnumerableTableScan(table=[[CATALOG, SALES, EMP]])
+    EnumerableTableScan(table=[[CATALOG, SALES, BONUS]])
+]]>
+        </Resource>
+    </TestCase>
+    <TestCase name="testHashJoinFullOuterJoinNotPushDownSort">
+        <Resource name="sql">
+            <![CDATA[
+"select * from
+sales.emp r full outer join sales.bonus s on r.ename=s.ename and r.job=s.job
+order by r.job desc nulls last, r.ename nulls first
+]]>
+        </Resource>
+        <Resource name="planBefore">
+            <![CDATA[
+LogicalSort(sort0=[$2], sort1=[$1], dir0=[DESC-nulls-last], 
dir1=[ASC-nulls-first])
+  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], ENAME0=[$9], JOB0=[$10], 
SAL0=[$11], COMM0=[$12])
+    LogicalJoin(condition=[AND(=($1, $9), =($2, $10))], joinType=[full])
+      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
+      LogicalTableScan(table=[[CATALOG, SALES, BONUS]])
+]]>
+        </Resource>
+        <Resource name="planAfter">
+            <![CDATA[
+EnumerableSort(sort0=[$2], sort1=[$1], dir0=[DESC-nulls-last], 
dir1=[ASC-nulls-first])
+  EnumerableHashJoin(condition=[AND(=($1, $9), =($2, $10))], joinType=[full])
+    EnumerableTableScan(table=[[CATALOG, SALES, EMP]])
+    EnumerableTableScan(table=[[CATALOG, SALES, BONUS]])
+]]>
+        </Resource>
+    </TestCase>
+    <TestCase name="testHashJoinLeftOuterJoinPushDownSort">
+        <Resource name="sql">
+            <![CDATA[select * from
+        sales.emp r left outer join sales.bonus s on r.ename=s.ename and 
r.job=s.job
+        order by r.job desc nulls last, r.ename nulls first]]>
+        </Resource>
+        <Resource name="planBefore">
+            <![CDATA[
+LogicalSort(sort0=[$2], sort1=[$1], dir0=[DESC-nulls-last], 
dir1=[ASC-nulls-first])
+  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], ENAME0=[$9], JOB0=[$10], 
SAL0=[$11], COMM0=[$12])
+    LogicalJoin(condition=[AND(=($1, $9), =($2, $10))], joinType=[left])
+      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
+      LogicalTableScan(table=[[CATALOG, SALES, BONUS]])
+]]>
+        </Resource>
+        <Resource name="planAfter">
+            <![CDATA[
+EnumerableHashJoin(condition=[AND(=($1, $9), =($2, $10))], joinType=[left])
+  EnumerableSort(sort0=[$2], sort1=[$1], dir0=[DESC-nulls-last], 
dir1=[ASC-nulls-first])
+    EnumerableTableScan(table=[[CATALOG, SALES, EMP]])
+    EnumerableTableScan(table=[[CATALOG, SALES, BONUS]])

Review comment:
       @hsyuan 
   
   I want to see this plan (sort on left join input for this left outer join), 
however its cost is too high(`{67.94680261461362 rows, 1346.0848941260901 cpu, 
0.0 io}`)
   
   So Calcite always chooses 
   ```
   EnumerableSort(sort0=[$2], sort1=[$1], dir0=[DESC-nulls-last], 
dir1=[ASC-nulls-first])
     EnumerableHashJoin(condition=[AND(=($1, $9), =($2, $10))], joinType=[left])
       EnumerableTableScan(table=[[CATALOG, SALES, EMP]])
       EnumerableTableScan(table=[[CATALOG, SALES, BONUS]])
   ```
   whose cost is `{54.94680261461362 rows, 68.0 cpu, 0.0 io}`.
   
   It could be make sense if EnumerableHashJoin reduces number of tuples thus 
Sort at the end could have less cost to sort joined result.
   
   
   Do you have a suggestion how to deterministically test trait pass down for 
EnumerableHashJoin?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to