[ 
https://issues.apache.org/jira/browse/CALCITE-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17749512#comment-17749512
 ] 

LakeShen commented on CALCITE-5880:
-----------------------------------

Hi [~leepb],the difference between is not distinct from and the = is that it 
takes into account the case that Null = Null.

I'd like to know what the data types are for emp.EMPNO and dep.deptno in your 
SQL example,if emp.EMPNO and dep.deptno datatype's isNullable is false,It says 
there will be no null value.Then the above plan is right.You could see that 
like this :

!image-2023-08-01-15-20-22-105.png|width=409,height=169!

I used the tables EMPNULLABLES and EMPDEFAULTS, whose fields DEPTNO isNullable 
is true, and the Schema information is in MockCatalogReaderSimple. The SQL I 
tested was as follows:
{code:java}
select EMPNO from EMPNULLABLES A where EXISTS (select DEPTNO from EMPDEFAULTS B 
where A.DEPTNO is not DISTINCT from B.DEPTNO) {code}
before decorrelation:
{code:java}
LogicalProject(EMPNO=[$0])
  LogicalFilter(condition=[IS NOT NULL($9)])
    LogicalCorrelate(correlation=[$cor0], joinType=[left], 
requiredColumns=[{7}])
      LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
      LogicalAggregate(group=[{}], agg#0=[MIN($0)])
        LogicalProject($f0=[true])
          LogicalFilter(condition=[OR(AND(IS NULL($cor0.DEPTNO), IS NULL($7)), 
IS TRUE(=($cor0.DEPTNO, $7)))])
            LogicalTableScan(table=[[CATALOG, SALES, EMPDEFAULTS]]) {code}
after decorrelation:
{code:java}
LogicalProject(EMPNO=[$0])
  LogicalFilter(condition=[IS NOT NULL($9)])
    LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f0=[$13])
      LogicalJoin(condition=[AND(=($7, $11), =($9, $12))], joinType=[left])
        LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f9=[IS 
NULL($7)], DEPTNO0=[$7])
          LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
        LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
          LogicalProject(DEPTNO0=[$9], $f9=[$10], $f0=[true])
            LogicalJoin(condition=[OR(AND($10, IS NULL($7)), =($9, $7))], 
joinType=[inner])
              LogicalTableScan(table=[[CATALOG, SALES, EMPDEFAULTS]])
              LogicalAggregate(group=[{0, 1}])
                LogicalProject(DEPTNO=[$7], $f9=[IS NULL($7)])
                  LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]]) 
{code}
In the cases I tested, the plan was correct.

> When the association condition of the association subquery is 'is not 
> distinct from', the join condition becomes '=' after decorrelation.
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5880
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5880
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: libopeng
>            Priority: Major
>         Attachments: image-2023-08-01-15-20-22-105.png
>
>
> {code:java}
> select EMPNO 
> from emp
> where EXISTS (select DEPTNO from dept where emp.EMPNO is not DISTINCT from 
> dept.DEPTNO){code}
> before decorrelation
> {code:java}
> LogicalProject(EMPNO=[$0])
>   LogicalFilter(condition=[IS NOT NULL($9)])
>     LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{0}])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalAggregate(group=[{}], agg#0=[MIN($0)])
>         LogicalProject($f0=[true])
>           LogicalFilter(condition=[OR(AND(IS NULL($cor0.EMPNO), IS NULL($0)), 
> IS TRUE(=($cor0.EMPNO, $0)))])    <---   here is 't1.x is not distinct from 
> t2.x'
>             LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> after decorrelation
> {code:java}
> LogicalProject(EMPNO=[$0])
>   LogicalFilter(condition=[IS NOT NULL($9)])
>     LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f0=[$13])
>       LogicalJoin(condition=[AND(=($0, $11), =($9, $12))], joinType=[left])   
>      <---here is 't1.x=t2.x and (t1.x is null = t2.x is null)'
>         LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f9=[false], 
> EMPNO0=[$0])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>         LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
>           LogicalProject(EMPNO=[$2], $f9=[$3], $f0=[true])
>             LogicalJoin(condition=[=($2, $0)], joinType=[inner])
>               LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>               LogicalProject(EMPNO=[$0], $f9=[false])
>                 LogicalTableScan(table=[[CATALOG, SALES, EMP]])  {code}
>  
> {code:java}
>     EMP               |         DEPT
> +---------+           |      +---------+
> | EMPNO  |            |      | DEPTNO |
> +---------+           |      +---------+
> | null |              |      | null   |
>            {code}
> expect result
> {code:java}
>   EMPNO
> +-------+
> |  null |
> +-------+{code}
> actual result
> {code:java}
>   EMPNO
> +-------+
> +-------+ {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to