[ 
https://issues.apache.org/jira/browse/CALCITE-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717071#comment-15717071
 ] 

Vineet Garg commented on CALCITE-1493:
--------------------------------------

{code}
+    LogicalFilter(condition=[IS NULL($11)])
+      LogicalJoin(condition=[AND(=($0, $12), =($2, $10))], joinType=[left])
+        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
+        LogicalJoin(condition=[=($3, $0)], joinType=[inner]) 
<======================================Why is this required ?
+          LogicalProject(DEPTNO=[$0], JOB=[$1], $f2=[true])
+            LogicalAggregate(group=[{0, 1}])
+              LogicalProject(DEPTNO=[$0], JOB=[$2], i=[$1])
+                LogicalProject(DEPTNO=[$0], i=[true], JOB=[$1])
+                  LogicalProject(DEPTNO=[$0], JOB=[$2])
+                    LogicalJoin(condition=[=($2, $1)], joinType=[inner])
+                      LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
+                      LogicalAggregate(group=[{0}])
+                        LogicalProject(JOB=[$2])
+                          LogicalTableScan(table=[[CATALOG, SALES, EMP]])
+          LogicalAggregate(group=[{0}])       
+            LogicalProject(EMPNO=[$0])
+              LogicalTableScan(table=[[CATALOG, SALES, EMP]])
 ]]>
{code}

Thanks Julian. I am wondering why is INNER JOIN in above plan required, where 
inner query is joined with distinct empno with condition *empno=deptno*. This 
is effectively making one of the left outer join condition *empno=deptno* 
always true. Not sure if this is semantically correct.

> Wrong plan for NOT IN correlated queries
> ----------------------------------------
>
>                 Key: CALCITE-1493
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1493
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Vineet Garg
>            Assignee: Julian Hyde
>              Labels: sub-query
>         Attachments: CALCITE-1493.test.patch
>
>
> Plan generated by calcite with SubqueryRemoveRule followed by de-correlation 
> for the following query:
> {noformat} select sal from emp where empno NOT IN (select deptno from dept 
> where emp.job = dept.name) {noformat}
> is
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalFilter(condition=[IS NULL($11)])
>       LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], 
> DEPTNO0=[CAST($9):INTEGER], JOB0=[CAST($10):VARCHAR(10) CHARACTER SET 
> "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f2=[CAST($11):BOOLEAN])
>         LogicalJoin(condition=[AND(=($2, $10), =($0, $9))], joinType=[inner])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>           LogicalProject(DEPTNO=[$0], JOB=[$1], $f2=[true])
>             LogicalAggregate(group=[{0, 1}])
>               LogicalProject(DEPTNO=[$0], JOB=[$2], i=[$1])
>                 LogicalProject(DEPTNO=[$0], i=[true], JOB=[$1])
>                   LogicalProject(DEPTNO=[$0], JOB=[$2])
>                     LogicalJoin(condition=[=($2, $1)], joinType=[inner])
>                       LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>                       LogicalAggregate(group=[{0}])
>                         LogicalProject(JOB=[$2])
>                           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {noformat}
> As you can notice instead of doing Left Outer Join Calcite is doing Inner 
> Join. This will produce wrong results.
> Plan for same query just before SubqueryRemove Rule is:
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalFilter(condition=[NOT(IN($0, {
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[=($cor0.JOB, $1)])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> }))], variablesSet=[[$cor0]])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {noformat}
> Plan just after SubqueryRemove Rule:
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalFilter(condition=[IS NULL($10)])
>       LogicalFilter(condition=[=($0, $9)])
>         LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], 
> requiredColumns=[{2}])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>           LogicalAggregate(group=[{0, 1}])
>             LogicalProject(DEPTNO=[$0], i=[true])
>               LogicalProject(DEPTNO=[$0])
>                 LogicalFilter(condition=[=($cor0.JOB, $1)])
>                   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> Looking at above it seems RelDecorrelator have some issue where it is coming 
> up with Inner Join.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to