Aaaaaaron commented on a change in pull request #2253:
URL: https://github.com/apache/calcite/pull/2253#discussion_r546313841
##########
File path:
core/src/main/java/org/apache/calcite/rel/rules/ReduceExpressionsRule.java
##########
@@ -366,14 +368,18 @@ public JoinReduceExpressionsRule(Class<? extends Join> joinClass,
@Override public void onMatch(RelOptRuleCall call) {
final Join join = call.rel(0);
- final List<RexNode> expList = Lists.newArrayList(join.getCondition());
+ final RexBuilder rexBuilder = join.getCluster().getRexBuilder();
+ final RexSimplify rexSimplify =
+ new RexSimplify(rexBuilder, RelOptPredicateList.EMPTY, EXECUTOR);
+ final RexNode condition = rexSimplify.eliminateCommonExprInCondition(join.getCondition());
+
+ final List<RexNode> expList = Lists.newArrayList(condition);
Review comment:
Hi @julianhyde
The improvements are as I described in JIRA, and both have unit tests in this
PR. Yes, the change is in JoinReduceExpressionsRule; if you do not like where I
put the method, I can move it somewhere else.
The benefit of this optimization is substantial, and it has already been
implemented in other engines such as Spark and Presto:
SQL:
```
SELECT * FROM emps,depts
WHERE
(emps.name = depts.name AND empno=1)
OR
(emps.name = depts.name AND empno=2)
```
After optimization, the join is:
`EnumerableNestedLoopJoin(condition=[OR(AND(=($1, $11), =($0, 1)), AND(=($1,
$11), =($0, 2)))], joinType=[inner])`
In fact, the common predicate `=($1, $11)` can be factored out, and the join becomes:
`HashJoin(condition=[AND(=($1, $11), OR(=($0, 1), =($0, 2)))],
joinType=[inner])`
**We found 8 queries matching this pattern in TPC-DS, and the benefits of the
optimization are significant:**
1. EnumerableNestedLoopJoin -> HashJoin
2. The filter can be pushed down
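
To make the rewrite above concrete, here is a minimal sketch of factoring common conjuncts out of an OR of ANDs. It is a toy model, not the code in this PR and not Calcite's API: predicates are plain strings, and the class `CommonConjunctFactoring` with its `factor` and `render` methods is hypothetical, introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model (hypothetical): a condition is an OR of branches, each branch an AND of predicates. */
final class CommonConjunctFactoring {

  /** Represents: AND(common..., OR(AND(remainder 1), AND(remainder 2), ...)). */
  static final class Factored {
    final Set<String> common;
    final List<Set<String>> remainders;

    Factored(Set<String> common, List<Set<String>> remainders) {
      this.common = common;
      this.remainders = remainders;
    }
  }

  /** Pulls the predicates shared by every OR branch out in front of the OR. */
  static Factored factor(List<Set<String>> branches) {
    if (branches.isEmpty()) {
      throw new IllegalArgumentException("need at least one OR branch");
    }
    // The common part is the intersection of all branches.
    Set<String> common = new LinkedHashSet<>(branches.get(0));
    for (Set<String> branch : branches) {
      common.retainAll(branch);
    }
    List<Set<String>> remainders = new ArrayList<>();
    for (Set<String> branch : branches) {
      Set<String> rest = new LinkedHashSet<>(branch);
      rest.removeAll(common);
      if (rest.isEmpty()) {
        // A branch made only of common predicates makes the residual OR trivially true.
        return new Factored(common, new ArrayList<>());
      }
      remainders.add(rest);
    }
    return new Factored(common, remainders);
  }

  /** Renders the factored condition in a plan-like prefix notation. */
  static String render(Factored f) {
    List<String> parts = new ArrayList<>(f.common);
    if (!f.remainders.isEmpty()) {
      List<String> ors = new ArrayList<>();
      for (Set<String> r : f.remainders) {
        ors.add(r.size() == 1 ? r.iterator().next() : "AND(" + String.join(", ", r) + ")");
      }
      parts.add(ors.size() == 1 ? ors.get(0) : "OR(" + String.join(", ", ors) + ")");
    }
    if (parts.isEmpty()) {
      return "true";
    }
    return parts.size() == 1 ? parts.get(0) : "AND(" + String.join(", ", parts) + ")";
  }

  public static void main(String[] args) {
    // The two OR branches from the join condition in the example query.
    List<Set<String>> branches = List.of(
        new LinkedHashSet<>(List.of("=($1, $11)", "=($0, 1)")),
        new LinkedHashSet<>(List.of("=($1, $11)", "=($0, 2)")));
    // prints AND(=($1, $11), OR(=($0, 1), =($0, 2)))
    System.out.println(render(factor(branches)));
  }
}
```

Once the equality `=($1, $11)` sits at the top level of the condition, the planner can use it as an equi-join key, which is what allows the hash join; the residual OR only references `$0` and so can be pushed to one side of the join.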