[ 
https://issues.apache.org/jira/browse/CALCITE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756798#comment-16756798
 ] 

Julian Hyde commented on CALCITE-2329:
--------------------------------------

The "more optimal plan" pushes the "deptno=2" filter further towards the table 
scan, whereas the original plan has it as a join condition. In the "more 
optimal plan" there is "select distinct true", so the Aggregate is needed to 
eliminate duplicates. It could be smart enough deduce that at most one row will 
satisfy "deptno = 2" (because deptno is the primary key), and if it could 
deduce that, it could remove the Aggregate, but it isn't.

> Enhance SubQueryRemoveRule to rewrite IN operator with the constant from the 
> left side more optimally
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2329
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2329
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Volodymyr Vysotskyi
>            Assignee: Julian Hyde
>            Priority: Major
>             Fix For: 1.17.0
>
>
> Currently, for queries like this:
> {code:sql}
> select sal from emp where 2 IN (select deptno from dept)
> {code}
> SubQueryRemoveRule rules expand query plan in such a way:
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalJoin(condition=[=(2, $9)], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalProject(DEPTNO=[$0])
>         LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> Since join condition is actually a filter condition, it will be pushed into 
> the filter during further planning stages and join with the true condition is 
> left.
> But these types of the queries may be rewritten in the same way as EXISTS 
> queries:
> {code:sql}
> select sal from emp where EXISTS (select deptno from dept where deptno=2)
> {code}
> with the more optimal plan:
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalJoin(condition=[true], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalAggregate(group=[{0}])
>         LogicalProject(i=[true])
>           LogicalFilter(condition=[=($0, 2)])
>             LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to