[
https://issues.apache.org/jira/browse/CALCITE-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755133#comment-17755133
]
LakeShen commented on CALCITE-5880:
-----------------------------------
[~julianhyde] ,[~leepb] ,When I use the SQL debug below, I find that the
Condition of the generated Join in the
RelDecorrelator#decorrelateRel(Correlate, boolean) method is wrong:
{code:java}
select EMPNO
from EMPNULLABLES
where EXISTS (select DEPTNO from EMPNULLABLES dept where EMPNULLABLES.EMPNO is
not DISTINCT from dept.DEPTNO) {code}
In RelOptRulesTest,code like this:
{code:java}
final String sql = "select EMPNO\n" +
"from EMPNULLABLES\n" +
"where EXISTS (select DEPTNO from EMPNULLABLES dept where
EMPNULLABLES.EMPNO is not " +
"DISTINCT from dept.DEPTNO) ";
sql(sql)
.withExpand(true)
.withLateDecorrelate(true)
.withRule()
.check();
{code}
After the decorrelate,the plan is :
{code:java}
LogicalProject(EMPNO=[$0])
LogicalFilter(condition=[IS NOT NULL($9)])
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4],
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f0=[$13])
LogicalJoin(condition=[AND(=($0, $11), =($9, $12))], joinType=[left])
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3],
HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f9=[false],
EMPNO0=[$0])
LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
LogicalProject(EMPNO0=[$9], $f9=[$10], $f0=[true])
LogicalJoin(condition=[OR(AND($10, IS NULL($7)), =($9, $7))],
joinType=[inner])
LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
LogicalProject(EMPNO=[$0], $f9=[false])
LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
{code}
Look at he condition of the top LogicalJoin:
{code:java}
LogicalJoin(condition=[AND(=($0, $11), =($9, $12))], joinType=[left])
{code}
It should be :
{code:java}
LogicalJoin(condition=[OR(=($0, $11), =($9, $12))], joinType=[left])
{code}
$0 represents the EMPNO column in the left table EMPNULLABLES
$11 represents the EMPNO column in the right table EMPNULLABLES
$9 is the false with left table EMPNULLABLES
$12 is the false with right table EMPNULLABLES
when the $0 is null and $11 is null, the condition
LogicalJoin(condition=[AND(=($0, $11), =($9, $12))], joinType=[left]) not
match,so right column would be null.
But the top LogicalFilter(condition=[IS NOT NULL($9)]) filters out NULL values.
> When the association condition of the association subquery is 'is not
> distinct from', the join condition becomes '=' after decorrelation.
> -----------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CALCITE-5880
> URL: https://issues.apache.org/jira/browse/CALCITE-5880
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: libopeng
> Assignee: LakeShen
> Priority: Major
> Fix For: 1.36.0
>
> Attachments: image-2023-08-01-15-20-22-105.png,
> image-2023-08-02-10-15-00-455.png, image-2023-08-13-21-36-00-494.png
>
>
> {code:java}
> select EMPNO
> from emp
> where EXISTS (select DEPTNO from dept where emp.EMPNO is not DISTINCT from
> dept.DEPTNO){code}
> before decorrelation
> {code:java}
> LogicalProject(EMPNO=[$0])
> LogicalFilter(condition=[IS NOT NULL($9)])
> LogicalCorrelate(correlation=[$cor0], joinType=[left],
> requiredColumns=[{0}])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{}], agg#0=[MIN($0)])
> LogicalProject($f0=[true])
> LogicalFilter(condition=[OR(AND(IS NULL($cor0.EMPNO), IS NULL($0)),
> IS TRUE(=($cor0.EMPNO, $0)))]) <--- here is 't1.x is not distinct from
> t2.x'
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> after decorrelation
> {code:java}
> LogicalProject(EMPNO=[$0])
> LogicalFilter(condition=[IS NOT NULL($9)])
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4],
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f0=[$13])
> LogicalJoin(condition=[AND(=($0, $11), =($9, $12))], joinType=[left])
> <---here is 't1.x=t2.x and (t1.x is null = t2.x is null)'
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3],
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f9=[false],
> EMPNO0=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
> LogicalProject(EMPNO=[$2], $f9=[$3], $f0=[true])
> LogicalJoin(condition=[=($2, $0)], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> LogicalProject(EMPNO=[$0], $f9=[false])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
>
> {code:java}
> EMP | DEPT
> +---------+ | +---------+
> | EMPNO | | | DEPTNO |
> +---------+ | +---------+
> | null | | | null |
> {code}
> expect result
> {code:java}
> EMPNO
> +-------+
> | null |
> +-------+{code}
> actual result
> {code:java}
> EMPNO
> +-------+
> +-------+ {code}
>
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)