Zhen Chen created CALCITE-7634:
----------------------------------

             Summary: JoinExpandOrToUnionRule incorrectly expands OR branches 
with non-equi predicates referencing both join inputs
                 Key: CALCITE-7634
                 URL: https://issues.apache.org/jira/browse/CALCITE-7634
             Project: Calcite
          Issue Type: Bug
          Components: core
    Affects Versions: 1.42.0
            Reporter: Zhen Chen
            Assignee: Zhen Chen
             Fix For: 1.43.0


{{JoinExpandOrToUnionRule}} may incorrectly split an OR branch that contains 
both an equi-join predicate and a non-equi predicate referencing both join 
inputs.

SQL:
{code:java}
select *
from EMP as p1
inner join EMP as p2
on (p1.empno = p2.empno and p1.sal < p2.sal)
or (p1.mgr = p2.mgr and p1.comm < p2.comm)
or p1.deptno = p2.deptno {code}
Before the fix, the rule expands the condition into 3 LogicalJoin branches:
{code:java}
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EMPNO0=[$9], ENAME0=[$10], 
JOB0=[$11], MGR0=[$12], HIREDATE0=[$13], SAL0=[$14], COMM0=[$15], 
DEPTNO0=[$16], SLACKER0=[$17])
  LogicalUnion(all=[true])
    LogicalJoin(condition=[AND(=($0, $9), <($5, $14))], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
    LogicalJoin(condition=[AND(=($3, $12), <($6, $15), OR(<>($0, $9), >=($5, 
$14)))], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
    LogicalJoin(condition=[AND(=($7, $16), OR(<>($0, $9), >=($5, $14)), 
OR(<>($3, $12), >=($6, $15)))], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
This is unsafe because predicates such as p1.sal < p2.sal and p1.comm < p2.comm 
reference columns from both join inputs and can evaluate to UNKNOWN under SQL 
three-valued logic. The generated NOT conditions are not equivalent to 
excluding previously matched branches in nullable cases.After the fix, the 
unsafe branches remain grouped, and only the safe equi-join branch is split out:
{code:java}
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EMPNO0=[$9], ENAME0=[$10], 
JOB0=[$11], MGR0=[$12], HIREDATE0=[$13], SAL0=[$14], COMM0=[$15], 
DEPTNO0=[$16], SLACKER0=[$17])
  LogicalUnion(all=[true])
    LogicalJoin(condition=[OR(AND(=($0, $9), <($5, $14)), AND(=($3, $12), <($6, 
$15)))], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
    LogicalJoin(condition=[AND(=($7, $16), OR(<>($0, $9), >=($5, $14)), 
OR(<>($3, $12), >=($6, $15)))], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
Root cause

`RexInputRefCounter.visitInputRef` increments `leftFieldCount` instead of 
`leftInputRefCount`, so predicates referencing the left input may be 
miscounted. 

 

 

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to