[ 
https://issues.apache.org/jira/browse/CALCITE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17717887#comment-17717887
 ] 

Jiajun Xie commented on CALCITE-5680:
-------------------------------------

It it a bug, incorrect SQL is more obvious.
{code:java}
select  "$cor0"."EMPNO"
from    (
            (
                select  "EMPNO",
                        "ENAME"
                from    "CATALOG"."SALES"."EMPNULLABLES"
            ) as "$cor0",
            LATERAL (
                select  "DEPTNO" is not null as "cs"
                from    (
                            select  "DEPTNO"
                            from    "CATALOG"."SALES"."DEPTNULLABLES"
                            where   "$cor0"."ENAME" = "NAME"
                            and     "DEPTNO" > 10
                        ) as "t1"
                where   1 = "DEPTNO"
                or      "DEPTNO" is null
                group by
                        "DEPTNO" is not null
            ) as "t4"
        ) as "$cor0",
        LATERAL (
            select  "DEPTNO" is not null as "cs"
            from    (
                        select  "DEPTNO"
                        from    "CATALOG"."SALES"."DEPTNULLABLES"
                        where   "$cor0"."ENAME" = "NAME"
                        and     "DEPTNO" < 20
                    ) as "t6"
            where   2 = "DEPTNO"
            or      "DEPTNO" is null
            group by
                    "DEPTNO" is not null
        ) as "t9"
where   case when "$cor0"."cs" = FALSE then null
             when "$cor0"."cs" is not null then true
             else FALSE
        end
or      case when "$cor0"."cs" = FALSE then null   -- t9 was missing
             when "$cor0"."cs" is not null then true
             else FALSE
        end{code}
 

> Wrong plan for multiple IN sub-queries with only literal operands
> -----------------------------------------------------------------
>
>                 Key: CALCITE-5680
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5680
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.34.0
>            Reporter: Runkang He
>            Assignee: Runkang He
>            Priority: Major
>              Labels: pull-request-available
>
> When the query contains multiple IN sub-queries with only literal operands, 
> and connected with OR predicate in WHERE clause, the result is wrong. The 
> minimal reproducer is below:
> SQL:
> {code:sql}
> select empno from sales.empnullables
> where 1 in (
>   select deptno from sales.deptnullables where name = 'dept1')
> or 2 in (
>   select deptno from sales.deptnullables where name = 'dept2')
> {code}
> The Plan generated by calcite master branch: (Notice the bold part of IS 
> NULL(*$2*) in the downstream LogicalFilter)
> {code:sql}
> LogicalProject(EMPNO=[$0])
>   LogicalFilter(condition=[OR(CASE(IS NULL($2), false, =($1, false), 
> null:BOOLEAN, IS NOT NULL($1), true, false), CASE(IS NULL(*$2*), false, =($1, 
> false), null:BOOLEAN, IS NOT NULL($1), true, false))])
>     LogicalJoin(condition=[true], joinType=[left])
>       LogicalJoin(condition=[true], joinType=[left])
>         LogicalProject(EMPNO=[$0])
>           LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
>         LogicalSort(sort0=[$0], dir0=[DESC], fetch=[1])
>           LogicalAggregate(group=[{0}], c=[COUNT()])
>             LogicalProject(cs=[IS NOT NULL($0)])
>               LogicalFilter(condition=[OR(=(1, $0), IS NULL($0))])
>                 LogicalProject(DEPTNO=[$0])
>                   LogicalFilter(condition=[=($1, 'dept1')])
>                     LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
>       LogicalSort(sort0=[$0], dir0=[DESC], fetch=[1])
>         LogicalAggregate(group=[{0}], c=[COUNT()])
>           LogicalProject(cs=[IS NOT NULL($0)])
>             LogicalFilter(condition=[OR(=(2, $0), IS NULL($0))])
>               LogicalProject(DEPTNO=[$0])
>                 LogicalFilter(condition=[=($1, 'dept2')])
>                   LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
> {code}
> The wrong part is that when build the downstream LogicalFilter for the two 
> sub-queries, the filter for the second sub-query is CASE(IS NULL($2), false, 
> =($1, false), null:BOOLEAN, IS NOT NULL($1), true, false), notice that *$2 
> should be the second sub-query's intermediate table field dt.c(which field 
> index is $4), and $1 should be the second sub-query's intermediate table 
> field dt.cs(which field index is $3), but now the actual reference is the 
> first sub-query's, this leads to wrong plan, and wrong result*.
> The root cause is that intermediate table alias is the same as the previous 
> sub-query's, but when lookup intermediate table field, it always returns the 
> previous one which is not belong to the current subquery.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to