hsyuan commented on a change in pull request #1157: [CALCITE-2969] Improve 
design of join-like relational expressions
URL: https://github.com/apache/calcite/pull/1157#discussion_r279158703
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/rel/core/Join.java
 ##########
 @@ -230,6 +237,17 @@ public boolean isSemiJoinDone() {
     return false;
   }
 
+  /**
+   * Returns whether this LogicalJoin is a semi-join but does
+   * not comes from decorrelate.
+   *
+   * @return true if this is semi but without correlate variables.
+   */
+  public boolean isNonCorrelateSemiJoin() {
+    return (this.variablesSet == null || this.variablesSet.size() == 0)
+        && joinType == JoinRelType.SEMI;
+  }
 
 Review comment:
   NestedLoop can represent both Join and Correlate. Take this query, for example:
   ```
   select * from R, S where r1 > s1;
   ```
   
   We can have the plan:
   ```
   NLJ (r1 > s1)
    +- R
    +- S
   
   for (r in R) {
     for (s in S) {
       if (condition(r, s))
         output r, s;
     }
   }
   ```
   
   or the plan:
   ```
   NLJ (true)
     +- R
     +- Filter (r1 > s1)
           +- S
   
   for (r in R) {
     while (s = scanNext(innerRel, r))
        output r,s;
   }
   ```
   
   The difference is whether we can start fetching inner tuples without knowing 
the current outer tuple. That is impossible for the second plan, which is a 
Correlate implementation. This is especially true when the inner input is an 
index scan.
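   
   To make the contrast concrete, here is a minimal Java sketch of the two loop shapes. It is an illustration only: `NestedLoopSketch`, `joinStyle`, `correlateStyle` and `innerScan` are hypothetical names, not Calcite APIs, and in-memory lists stand in for R and S. In the first method the inner input is fixed before the loop starts; in the second, the inner rows are produced by a function of the current outer row, mirroring `scanNext(innerRel, r)` in the pseudocode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Hypothetical illustration; none of these names are Calcite APIs. */
class NestedLoopSketch {

  /** Join style: the inner input is known up front, independent of the outer row. */
  static List<List<Integer>> joinStyle(List<Integer> outer, List<Integer> inner,
      BiPredicate<Integer, Integer> condition) {
    List<List<Integer>> out = new ArrayList<>();
    for (Integer r : outer) {
      for (Integer s : inner) {
        if (condition.test(r, s)) {
          out.add(Arrays.asList(r, s));
        }
      }
    }
    return out;
  }

  /** Correlate style: the inner rows can only be produced once the outer row is known. */
  static List<List<Integer>> correlateStyle(List<Integer> outer,
      Function<Integer, List<Integer>> innerScan) {
    List<List<Integer>> out = new ArrayList<>();
    for (Integer r : outer) {
      for (Integer s : innerScan.apply(r)) {
        out.add(Arrays.asList(r, s));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> rTable = Arrays.asList(1, 5, 9);
    List<Integer> sTable = Arrays.asList(2, 4, 6);

    // select * from R, S where r1 > s1, evaluated as a plain nested-loop join.
    List<List<Integer>> joined = joinStyle(rTable, sTable, (r1, s1) -> r1 > s1);

    // Same query, with the filter pushed into a scan parameterized by r1.
    List<List<Integer>> correlated = correlateStyle(rTable, r1 -> {
      List<Integer> matches = new ArrayList<>();
      for (Integer s1 : sTable) {
        if (r1 > s1) {
          matches.add(s1);
        }
      }
      return matches;
    });

    System.out.println(joined.equals(correlated));
  }
}
```

   Both methods return the same rows for this query; what differs is that `correlateStyle` cannot touch the inner input until it has an outer row in hand.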
   
   It is true that, depending on the implementation, we can always use the 
second plan to implement the join; that is why we have the JoinToCorrelate rule.
   
   But logically I think we'd better not mix them.
