mihaibudiu commented on code in PR #4814:
URL: https://github.com/apache/calcite/pull/4814#discussion_r2881046133


##########
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java:
##########
@@ -787,6 +788,60 @@ private static RexNode rewriteIn(RexSubQuery e, 
Set<CorrelationId> variablesSet,
     // Now the left join
     builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);
 
+    // for single-column IN (e.g. deptno IN (select deptno ...)), the CASE
+    // expression above includes:
+    //   when ct.ck < ct.c then null   -- some RHS value is NULL
+    // this is correct: if no direct match was found (dt.i IS NULL) and any
+    // RHS value is NULL, the answer for every unmatched LHS key is UNKNOWN,
+    // because we cannot rule out that the LHS key equals the NULL.

Review Comment:
   equals the NULL -> is NULL.



##########
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java:
##########
@@ -787,6 +788,60 @@ private static RexNode rewriteIn(RexSubQuery e, 
Set<CorrelationId> variablesSet,
     // Now the left join
     builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);
 
+    // for single-column IN (e.g. deptno IN (select deptno ...)), the CASE

Review Comment:
   I think this would be easier to read if you showed first an example of the 
LEFT JOIN that is being produced and then followed with these comments. 



##########
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java:
##########
@@ -787,6 +788,60 @@ private static RexNode rewriteIn(RexSubQuery e, 
Set<CorrelationId> variablesSet,
     // Now the left join
     builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);
 
+    // for single-column IN (e.g. deptno IN (select deptno ...)), the CASE
+    // expression above includes:
+    //   when ct.ck < ct.c then null   -- some RHS value is NULL
+    // this is correct: if no direct match was found (dt.i IS NULL) and any
+    // RHS value is NULL, the answer for every unmatched LHS key is UNKNOWN,
+    // because we cannot rule out that the LHS key equals the NULL.
+    //
+    // for multi-column IN, the global ck < c check is insufficient. Consider:
+    //   (empno, deptno) IN (select empno, deptno from ...)
+    // where one RHS row is (7521, null). For each LHS row, the dt join 
condition
+    // "empno = rhs.empno AND deptno = rhs.deptno" becomes "empno = 7521 AND 
deptno = null":
+    //   for LHS empno=7521: (7521 = 7521) AND (7521 = null) = TRUE  AND 
UNKNOWN = UNKNOWN

Review Comment:
   This is still confusing because `=` is used in two ways, maybe you can 
replace some `=` to "which reduces to"



##########
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java:
##########
@@ -826,16 +881,20 @@ private static RexNode rewriteIn(RexSubQuery e, 
Set<CorrelationId> variablesSet,
       operands.add(builder.isNotNull(builder.field(dtAlias, "cs")),
           trueLiteral);
     } else {
-      operands.add(builder.isNotNull(last(builder.fields())),
+      // Use alias-based field reference so that adding nr JOIN doesn't shift 
index
+      operands.add(builder.isNotNull(builder.field(dtAlias, "i")),
           trueLiteral);
     }
 
     if (!allLiterals) {
       switch (logic) {
       case TRUE_FALSE_UNKNOWN:
       case UNKNOWN_AS_TRUE:
-        // only reference ctAlias if we created it
-        if (needsNullSafety) {
+        if (needsNullRowJoin) {
+          // nr.i IS NOT NULL: LHS key matched a null-row in RHS → UNKNOWN.
+          operands.add(builder.isNotNull(builder.field(nrAlias, "i")), b);
+        } else if (needsNullSafety) {

Review Comment:
   is the original comment valuable? If so, maybe you can move it here
   Also, after this PR is merged, the word "traditional" will make no sense 
anymore - people who will read the code won't know that it was here before the 
other code was added. So I would just remove it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to