Runkang He created CALCITE-5680:
-----------------------------------

             Summary: Wrong plan for multiple IN sub-queries with only literal 
operands
                 Key: CALCITE-5680
                 URL: https://issues.apache.org/jira/browse/CALCITE-5680
             Project: Calcite
          Issue Type: Bug
    Affects Versions: 1.34.0
            Reporter: Runkang He


When the query contains multiple IN sub-queries with only literal operands, and 
connected with OR predicate in WHERE clause, the result is wrong. The minimal 
reproducer is below:
SQL:

{code:sql}
select empno from sales.empnullables
where 1 in (
  select deptno from sales.deptnullables where name = 'dept1')
or 2 in (
  select deptno from sales.deptnullables where name = 'dept2')
{code}

The Plan generated by calcite master branch: (Notice the bold part of IS 
NULL(*$2*) in the downstream LogicalFilter)

{code:sql}
LogicalProject(EMPNO=[$0])
  LogicalFilter(condition=[OR(CASE(IS NULL($2), false, =($1, false), 
null:BOOLEAN, IS NOT NULL($1), true, false), CASE(IS NULL(*$2*), false, =($1, 
false), null:BOOLEAN, IS NOT NULL($1), true, false))])
    LogicalJoin(condition=[true], joinType=[left])
      LogicalJoin(condition=[true], joinType=[left])
        LogicalProject(EMPNO=[$0])
          LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
        LogicalSort(sort0=[$0], dir0=[DESC], fetch=[1])
          LogicalAggregate(group=[{0}], c=[COUNT()])
            LogicalProject(cs=[IS NOT NULL($0)])
              LogicalFilter(condition=[OR(=(1, $0), IS NULL($0))])
                LogicalProject(DEPTNO=[$0])
                  LogicalFilter(condition=[=($1, 'dept1')])
                    LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
      LogicalSort(sort0=[$0], dir0=[DESC], fetch=[1])
        LogicalAggregate(group=[{0}], c=[COUNT()])
          LogicalProject(cs=[IS NOT NULL($0)])
            LogicalFilter(condition=[OR(=(2, $0), IS NULL($0))])
              LogicalProject(DEPTNO=[$0])
                LogicalFilter(condition=[=($1, 'dept2')])
                  LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
{code}

The wrong part is that when build the downstream LogicalFilter for the two 
sub-queries, the filter for the second sub-query is CASE(IS NULL($2), false, 
=($1, false), null:BOOLEAN, IS NOT NULL($1), true, false), notice that *$2 
should be the second sub-query's intermediate table field dt.c(which field 
index is $4), and $1 should be the second sub-query's intermediate table field 
dt.cs(which field index is $3), but now the actual reference is the first 
sub-query's, this leads to wrong plan, and wrong result*.

The root cause is that intermediate table alias is the same as the previous 
sub-query's, but when lookup intermediate table field, it always returns the 
previous one which is not belong to the current subquery.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to