amaliujia commented on a change in pull request #1995:
URL: https://github.com/apache/calcite/pull/1995#discussion_r433701431
##########
File path: core/src/test/java/org/apache/calcite/test/TopDownOptTest.java
##########
@@ -288,6 +290,44 @@
.removeRule(EnumerableRules.ENUMERABLE_JOIN_RULE)
.check();
}
+
+ // Not push down sort for hash join in inner join case.
+ @Test void testHashJoinInnerJoinNotPushDownSort() {
+ final String sql = "select * from\n"
+ + "sales.emp r join sales.bonus s on r.ename=s.ename and r.job=s.job\n"
+ + "order by r.job desc nulls last, r.ename nulls first";
+ Query.create(sql)
+ .removeRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE)
+ .removeRule(EnumerableRules.ENUMERABLE_BATCH_NESTED_LOOP_JOIN_RULE)
+ .check();
+ }
+
+ // Not push down sort for hash join in full outer join case.
+ @Test void testHashJoinFullOuterJoinNotPushDownSort() {
+ final String sql = "select * from\n"
+ + "sales.emp r full outer join sales.bonus s on r.ename=s.ename and
r.job=s.job\n"
+ + "order by r.job desc nulls last, r.ename nulls first";
+ Query.create(sql)
+ .removeRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE)
+ .removeRule(EnumerableRules.ENUMERABLE_BATCH_NESTED_LOOP_JOIN_RULE)
+ .check();
+ }
+
+ // Push down sort to left input.
+ @Test void testHashJoinLeftOuterJoinPushDownSort() {
+ final String sql = "select * from\n"
+ + "(select ename from sales.emp UNION ALL select ename from sales.emp
UNION ALL select ename from sales.emp UNION ALL select ename from sales.emp
UNION ALL select ename from sales.emp UNION ALL select ename from sales.emp
UNION ALL select ename from sales.emp UNION ALL select ename from sales.emp
UNION ALL select ename from sales.emp) r left outer join\n"
+ + "(select ename from sales.bonus UNION ALL select ename from
sales.bonus UNION ALL select ename from sales.bonus UNION ALL select ename from
sales.bonus UNION ALL select ename from sales.bonus UNION ALL select ename from
sales.bonus UNION ALL select ename from sales.bonus UNION ALL select ename from
sales.bonus UNION ALL select ename from sales.bonus UNION ALL select ename from
sales.bonus) s\n"
+ + "on r.ename=s.ename\n"
+ + "order by r.ename nulls first";
Review comment:
I know the test query becomes ugly and not easy to check, but I don't
know if there is a better idea.
##########
File path: core/src/test/java/org/apache/calcite/test/TopDownOptTest.java
##########
@@ -288,6 +290,44 @@
.removeRule(EnumerableRules.ENUMERABLE_JOIN_RULE)
.check();
}
+
+ // Not push down sort for hash join in inner join case.
+ @Test void testHashJoinInnerJoinNotPushDownSort() {
+ final String sql = "select * from\n"
+ + "sales.emp r join sales.bonus s on r.ename=s.ename and r.job=s.job\n"
+ + "order by r.job desc nulls last, r.ename nulls first";
+ Query.create(sql)
+ .removeRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE)
+ .removeRule(EnumerableRules.ENUMERABLE_BATCH_NESTED_LOOP_JOIN_RULE)
+ .check();
+ }
+
+ // Not push down sort for hash join in full outer join case.
+ @Test void testHashJoinFullOuterJoinNotPushDownSort() {
+ final String sql = "select * from\n"
+ + "sales.emp r full outer join sales.bonus s on r.ename=s.ename and
r.job=s.job\n"
+ + "order by r.job desc nulls last, r.ename nulls first";
+ Query.create(sql)
+ .removeRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE)
+ .removeRule(EnumerableRules.ENUMERABLE_BATCH_NESTED_LOOP_JOIN_RULE)
+ .check();
+ }
+
+ // Push down sort to left input.
+ @Test void testHashJoinLeftOuterJoinPushDownSort() {
+ final String sql = "select * from\n"
+ + "(select ename from sales.emp UNION ALL select ename from sales.emp
UNION ALL select ename from sales.emp UNION ALL select ename from sales.emp
UNION ALL select ename from sales.emp UNION ALL select ename from sales.emp
UNION ALL select ename from sales.emp UNION ALL select ename from sales.emp
UNION ALL select ename from sales.emp) r left outer join\n"
+ + "(select ename from sales.bonus UNION ALL select ename from
sales.bonus UNION ALL select ename from sales.bonus UNION ALL select ename from
sales.bonus UNION ALL select ename from sales.bonus UNION ALL select ename from
sales.bonus UNION ALL select ename from sales.bonus UNION ALL select ename from
sales.bonus UNION ALL select ename from sales.bonus UNION ALL select ename from
sales.bonus) s\n"
+ + "on r.ename=s.ename\n"
+ + "order by r.ename nulls first";
Review comment:
@hsyuan
Finally I find a way to "fool" planner to choose the plan I want, basically
the plan that pushes sort down to left child of hash join.
The reason that simple join's cheapest plan has a Sort on top of HashJoin,
is because the Join's estimated is computed as simple as `left_row_count *
right_row_count * 0.15`, 0.15 is selectively. When inputs's row count is small,
the selectively plays a bigger role, thus HashJoin gives a small cost, thus
Sort is put after Join because Join gives a very small row count estimation.
So I intentionally use UNION to increase input size, which make the
`left_row_count * right_row_count` plays a bigger role for Join, and then Join
produces larger row count estimation. Then finally planner believes push down
Sort is cheaper, thus gives the plan I want to see.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]