mihaibudiu commented on code in PR #4814:
URL: https://github.com/apache/calcite/pull/4814#discussion_r2875065788
##########
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java:
##########
@@ -787,6 +788,54 @@ private static RexNode rewriteIn(RexSubQuery e,
Set<CorrelationId> variablesSet,
// Now the left join
builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);
+ // for multi-column IN with nullable RHS columns, the global ck < c check
is
+ // insufficient. Consider:
+ //
+ // (empno, deptno) IN (select empno, deptno from ...)
+ //
+ // where one RHS row is (7521, null):
+ // empno=7521: AND(TRUE, UNKNOWN) = UNKNOWN → result should be null
Review Comment:
I also don't understand what this AND is
##########
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java:
##########
@@ -787,6 +788,54 @@ private static RexNode rewriteIn(RexSubQuery e,
Set<CorrelationId> variablesSet,
// Now the left join
builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);
+ // for multi-column IN with nullable RHS columns, the global ck < c check
is
+ // insufficient. Consider:
+ //
+ // (empno, deptno) IN (select empno, deptno from ...)
+ //
+ // where one RHS row is (7521, null):
+ // empno=7521: AND(TRUE, UNKNOWN) = UNKNOWN → result should be null
+ // empno=7369: AND(FALSE, UNKNOWN) = FALSE → result should be false
+ //
+ // ck < c only tells us "some RHS row has a null", but cannot tell which
+ // LHS keys are affected. Instead, we LEFT JOIN a null-rows table (nr) of
+ // RHS rows that have at least one NULL column, using per-column condition
+ // (col IS NULL OR key = col), where a NULL col is a wildcard.
+ // nr.i IS NOT NULL means a null-row matched the LHS key, result is
UNKNOWN.
+ // nr.i IS NULL means no null-row matched, fall through to false.
+ String nrAlias = "nr";
+ if (subQueryIndex != 0) {
+ nrAlias = "nr" + subQueryIndex;
+ }
+ // Only needed for multi-column, non-literal IN with null-safety logic.
+ // Single-column cases are handled correctly by the existing ck < c check.
Review Comment:
What happens if the single column has a ROW type?
##########
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java:
##########
@@ -787,6 +788,54 @@ private static RexNode rewriteIn(RexSubQuery e,
Set<CorrelationId> variablesSet,
// Now the left join
builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);
+ // for multi-column IN with nullable RHS columns, the global ck < c check
is
Review Comment:
The comment here needs some context: the actual comparison `ck < c` is much
lower.
Perhaps you can first explain how the expansion is handled for scalar values.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]