[ 
https://issues.apache.org/jira/browse/CALCITE-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821775#comment-16821775
 ] 

Danny Chan commented on CALCITE-1624:
-------------------------------------

Hei, [~vgarg], when i attach the test case to SqlToRelConverter:
{code:java}
@Test public void testNotInSubquery() {
  final String sql = "select sal from emp\n"
      + "where empno NOT IN (\n"
      + "  select deptno from dept\n"
      + "  where emp.job = dept.name)";
  sql(sql).ok();
}{code}
This plan will produce:
{code:java}
LogicalProject(SAL=[$5])
  LogicalFilter(condition=[IS NOT TRUE($12)])
    LogicalJoin(condition=[AND(=($2, $11), =($9, $10))], joinType=[left])
      LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5],      COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EMPNO0=[$0])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
        LogicalProject(DEPTNO=[$0], NAME=[$1], $f1=[true])
          LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
Which i think is wrong, cause `IS NOT TRUE($12)` is always false, can you 
please help to make sure ?

 

> Inefficient plan for NOT IN correlated subqueries
> -------------------------------------------------
>
>                 Key: CALCITE-1624
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1624
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Vineet Garg
>            Assignee: Danny Chan
>            Priority: Major
>
> I just noticed that {{NOT IN}} correlated subqueries produces an extra 
> un-neccessary join after de-correlation (this is an addition to un-necessary 
> joins reported in CALCITE-1494)
> Query
> {code:SQL}
> select sal from emp
> where empno NOT IN (
>   select deptno from dept
>   where emp.job = dept.name)
> {code}
> Plan after subquery remove rule:
> {code}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalFilter(condition=[NOT(CASE(=($9, 0), false, IS NOT NULL($12), 
> true, <($10, $9), true, false))])
>       LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], 
> requiredColumns=[{2}])
>         LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], 
> requiredColumns=[{2}])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>           LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)])
>             LogicalProject(DEPTNO=[$0])
>               LogicalFilter(condition=[=($cor0.JOB, $1)])
>                 LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>         LogicalFilter(condition=[=($cor0.EMPNO, $0)])
>           LogicalAggregate(group=[{0, 1}])
>             LogicalProject(DEPTNO=[$0], i=[true])
>               LogicalProject(DEPTNO=[$0])
>                 LogicalFilter(condition=[=($cor0.JOB, $1)])
>                   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> Plan after de-correlation
> {code}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalFilter(condition=[NOT(CASE(=($10, 0), false, IS NOT NULL($14), 
> true, <($11, $10), true, false))])
>       LogicalJoin(condition=[AND(=($0, $15), =($2, $13))], joinType=[left])
>         LogicalJoin(condition=[=($2, $9)], joinType=[left])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>           LogicalAggregate(group=[{0}], c=[COUNT()], ck=[COUNT($1)])
>             LogicalProject(JOB=[$1], DEPTNO=[$0])
>               LogicalProject(DEPTNO=[$0], JOB=[$2])
>                 LogicalJoin(condition=[=($2, $1)], joinType=[inner])
>                   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>                   LogicalAggregate(group=[{0}])
>                     LogicalProject(JOB=[$2])
>                       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>         LogicalJoin(condition=[=($3, $0)], joinType=[inner]) // <== 
> Un-necessary join
>           LogicalProject(DEPTNO=[$0], JOB=[$1], $f2=[true])
>             LogicalAggregate(group=[{0, 1}])
>               LogicalProject(DEPTNO=[$0], JOB=[$2], i=[$1])
>                 LogicalProject(DEPTNO=[$0], i=[true], JOB=[$1])
>                   LogicalProject(DEPTNO=[$0], JOB=[$2])
>                     LogicalJoin(condition=[=($2, $1)], joinType=[inner])
>                       LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>                       LogicalAggregate(group=[{0}])
>                         LogicalProject(JOB=[$2])
>                           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>           LogicalAggregate(group=[{0}])
>             LogicalProject(EMPNO=[$0])
>               LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> As you can see in plan after de-correlation there is an un-necessary inner 
> join.
> This is not reproducible on CALCITE-1494's branch. But since this is a 
> separate issue from CALCITE-1494 I decided to open a separate JIRA.
> Feel free to mark is duplicate or close it if you think otherwise.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to