[ 
https://issues.apache.org/jira/browse/CALCITE-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755133#comment-17755133
 ] 

LakeShen commented on CALCITE-5880:
-----------------------------------

[~julianhyde] ,[~leepb] ,When I use the SQL debug below, I find that the 
Condition of the generated Join in the 
RelDecorrelator#decorrelateRel(Correlate, boolean) method is wrong:
{code:java}
select EMPNO
from EMPNULLABLES
where EXISTS (select DEPTNO from  EMPNULLABLES dept where EMPNULLABLES.EMPNO is 
not DISTINCT from dept.DEPTNO)  {code}
In RelOptRulesTest,code like this:
{code:java}
final String sql = "select EMPNO\n" +
        "from EMPNULLABLES\n" +
        "where EXISTS (select DEPTNO from  EMPNULLABLES dept where 
EMPNULLABLES.EMPNO is not " +
        "DISTINCT from dept.DEPTNO) ";
    sql(sql)
        .withExpand(true)
        .withLateDecorrelate(true)
        .withRule()
        .check();
{code}
After the decorrelate,the plan is :
{code:java}
LogicalProject(EMPNO=[$0])
  LogicalFilter(condition=[IS NOT NULL($9)])
    LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f0=[$13])
      LogicalJoin(condition=[AND(=($0, $11), =($9, $12))], joinType=[left])
        LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f9=[false], 
EMPNO0=[$0])
          LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
        LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
          LogicalProject(EMPNO0=[$9], $f9=[$10], $f0=[true])
            LogicalJoin(condition=[OR(AND($10, IS NULL($7)), =($9, $7))], 
joinType=[inner])
              LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
              LogicalProject(EMPNO=[$0], $f9=[false])
                LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
{code}
Look at he condition of the top LogicalJoin:
{code:java}
 LogicalJoin(condition=[AND(=($0, $11), =($9, $12))], joinType=[left])
{code}
It should be :
{code:java}
 LogicalJoin(condition=[OR(=($0, $11), =($9, $12))], joinType=[left])
{code}
$0 represents the EMPNO column in the left table EMPNULLABLES
$11 represents the EMPNO column in the right table EMPNULLABLES
$9 is the false with left table EMPNULLABLES
$12 is the false with right table  EMPNULLABLES

when the $0 is null and $11 is null, the condition  
LogicalJoin(condition=[AND(=($0, $11), =($9, $12))], joinType=[left]) not 
match,so right column would be null.

But the top LogicalFilter(condition=[IS NOT NULL($9)]) filters out NULL values.

> When the association condition of the association subquery is 'is not 
> distinct from', the join condition becomes '=' after decorrelation.
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5880
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5880
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: libopeng
>            Assignee: LakeShen
>            Priority: Major
>             Fix For: 1.36.0
>
>         Attachments: image-2023-08-01-15-20-22-105.png, 
> image-2023-08-02-10-15-00-455.png, image-2023-08-13-21-36-00-494.png
>
>
> {code:java}
> select EMPNO 
> from emp
> where EXISTS (select DEPTNO from dept where emp.EMPNO is not DISTINCT from 
> dept.DEPTNO){code}
> before decorrelation
> {code:java}
> LogicalProject(EMPNO=[$0])
>   LogicalFilter(condition=[IS NOT NULL($9)])
>     LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{0}])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalAggregate(group=[{}], agg#0=[MIN($0)])
>         LogicalProject($f0=[true])
>           LogicalFilter(condition=[OR(AND(IS NULL($cor0.EMPNO), IS NULL($0)), 
> IS TRUE(=($cor0.EMPNO, $0)))])    <---   here is 't1.x is not distinct from 
> t2.x'
>             LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> after decorrelation
> {code:java}
> LogicalProject(EMPNO=[$0])
>   LogicalFilter(condition=[IS NOT NULL($9)])
>     LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f0=[$13])
>       LogicalJoin(condition=[AND(=($0, $11), =($9, $12))], joinType=[left])   
>      <---here is 't1.x=t2.x and (t1.x is null = t2.x is null)'
>         LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f9=[false], 
> EMPNO0=[$0])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>         LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
>           LogicalProject(EMPNO=[$2], $f9=[$3], $f0=[true])
>             LogicalJoin(condition=[=($2, $0)], joinType=[inner])
>               LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>               LogicalProject(EMPNO=[$0], $f9=[false])
>                 LogicalTableScan(table=[[CATALOG, SALES, EMP]])  {code}
>  
> {code:java}
>     EMP               |         DEPT
> +---------+           |      +---------+
> | EMPNO  |            |      | DEPTNO |
> +---------+           |      +---------+
> | null |              |      | null   |
>            {code}
> expect result
> {code:java}
>   EMPNO
> +-------+
> |  null |
> +-------+{code}
> actual result
> {code:java}
>   EMPNO
> +-------+
> +-------+ {code}
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to