Aaaaaaron commented on a change in pull request #2253:
URL: https://github.com/apache/calcite/pull/2253#discussion_r521022915



##########
File path: core/src/main/java/org/apache/calcite/rex/RexSimplify.java
##########
@@ -1689,6 +1690,123 @@ RexNode simplifyAnd2ForUnknownAsFalse(List<RexNode> 
terms,
     }
   }
 
+  /**
+   * <p>Simplifies AND/OR condition that has common expressions,
+   * extract and eliminate/merge them as much as possible.</p>
+   * <ul>
+   * <li>(a OR b) AND (a OR b OR c OR d) =&gt; (a OR b)
+   * <li>(a OR b OR c OR d) AND (a OR b OR e OR f) =&gt; (a OR b) OR ((c OR d) 
AND (e OR f))
+
+   * <li>(a AND b) OR (a AND b AND c) =&gt; (a AND b)
+   * <li>(a AND b AND c AND d) OR (a AND b AND e AND f) =&gt;
+   * (a AND b) AND ((c AND d) OR (e AND f))
+   * </ul>
+   * <p> The difference between {@link #simplifyAnd} is that {@link 
#simplifyAnd} mainly
+   * simplifies expressions whose answer can be determined without evaluating 
both sides,
+   * like: FALSE AND (xxx) =&gt; FALSE</p>
+   */
+  public RexNode eliminateCommonExprInCondition(RexNode call) {

Review comment:
       Just pasted from CALCITE-4375 if you have no time to read:
   SQL:
   ```
   SELECT * FROM emps,depts
   WHERE
   (emps.name = depts.name AND empno=1)
   OR
   (emps.name = depts.name AND empno=2)
   ```
   
   And the join after optimizer is:
   
   `EnumerableNestedLoopJoin(condition=[OR(AND(=($1, $11), =($0, 1)), AND(=($1, 
$11), =($0, 2)))], joinType=[inner])`
   
   
   In fact ($1, $11) can be extracted, and the join can be:
   
   `HashJoin(condition=[AND(=($1, $11), OR(=($0, 1), =($0, 2)))], 
joinType=[inner]) `
   
   and then the or cond can be pushed down, and the SQL is like
   
   `select * from emps, depts on( emps.name = depts.name ) where (empno=1 OR 
empno=2)`
   
   We found 8 queries like this patten in TPC-DS, he benefits of optimization 
are great:
   1. EnumerableNestedLoopJoin -> HashJoin
   2. filter can be push down
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to