[
https://issues.apache.org/jira/browse/CALCITE-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15672317#comment-15672317
]
Vineet Garg commented on CALCITE-1493:
--------------------------------------
I notice that the plan just after SubqueryRemoveRule consists of
LogicalCorrelate with LEFT joinType and a LogicalFilter corresponding to WHERE
IN subquery column on top of it. While de-correlating this LogicalCorrelate
decorrelation logic appropriately creates LEFT JOIN with condition as
conjunction of co-related variables, but it leaves LogicalFilter on top of it.
After de-correlation decorrelateRel calls findBestExp where in this
LogicalFilter + LogicalJoin(LEFT) is converted to INNER JOIN.
I believe creating LogicalFilter on top of LogicalCorrelate is wrong. If
LogicalCorrelate is of type LEFT then this LogicalFilter should rather be a
part of LogicalCorrelate as a condition. (Although it looks like current design
doesn't allow a condition with LogicalCorrelate ).
[~julianhyde] Any thoughts on this ?
> Wrong plan for NOT IN correlated queries
> ----------------------------------------
>
> Key: CALCITE-1493
> URL: https://issues.apache.org/jira/browse/CALCITE-1493
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Vineet Garg
> Assignee: Julian Hyde
>
> Plan generated by calcite with SubqueryRemoveRule followed by de-correlation
> for the following query:
> {noformat} select sal from emp where empno NOT IN (select deptno from dept
> where emp.job = dept.name) {noformat}
> is
> {noformat}
> LogicalProject(SAL=[$5])
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4],
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalFilter(condition=[IS NULL($11)])
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3],
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8],
> DEPTNO0=[CAST($9):INTEGER], JOB0=[CAST($10):VARCHAR(10) CHARACTER SET
> "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f2=[CAST($11):BOOLEAN])
> LogicalJoin(condition=[AND(=($2, $10), =($0, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalProject(DEPTNO=[$0], JOB=[$1], $f2=[true])
> LogicalAggregate(group=[{0, 1}])
> LogicalProject(DEPTNO=[$0], JOB=[$2], i=[$1])
> LogicalProject(DEPTNO=[$0], i=[true], JOB=[$1])
> LogicalProject(DEPTNO=[$0], JOB=[$2])
> LogicalJoin(condition=[=($2, $1)], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> LogicalAggregate(group=[{0}])
> LogicalProject(JOB=[$2])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {noformat}
> As you can notice instead of doing Left Outer Join Calcite is doing Inner
> Join. This will produce wrong results.
> Plan for same query just before SubqueryRemove Rule is:
> {noformat}
> LogicalProject(SAL=[$5])
> LogicalFilter(condition=[NOT(IN($0, {
> LogicalProject(DEPTNO=[$0])
> LogicalFilter(condition=[=($cor0.JOB, $1)])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> }))], variablesSet=[[$cor0]])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {noformat}
> Plan just after SubqueryRemove Rule:
> {noformat}
> LogicalProject(SAL=[$5])
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4],
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalFilter(condition=[IS NULL($10)])
> LogicalFilter(condition=[=($0, $9)])
> LogicalCorrelate(correlation=[$cor0], joinType=[LEFT],
> requiredColumns=[{2}])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
> LogicalProject(DEPTNO=[$0], i=[true])
> LogicalProject(DEPTNO=[$0])
> LogicalFilter(condition=[=($cor0.JOB, $1)])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> Looking at above it seems RelDecorrelator have some issue where it is coming
> up with Inner Join.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)