[ 
https://issues.apache.org/jira/browse/CALCITE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491017#comment-16491017
 ] 

Julian Hyde commented on CALCITE-2329:
--------------------------------------

Note that if there is a NULL value for {{deptno}} the IN sub-query will return 
UNKNOWN and the EXISTS sub-query will return FALSE. It doesn't matter in this 
case, because both of them are inside WHERE, which treats UNKNOWN and FALSE the 
same way. But if they were inside NOT, say, it would matter.

So yes, we should do better in this case, but make sure we continue to do the 
right thing when null semantics matter.

> Enhance SubQueryRemoveRule to rewrite IN operator with the constant from the 
> left side more optimally
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2329
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2329
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Volodymyr Vysotskyi
>            Assignee: Julian Hyde
>            Priority: Major
>
> Currently, for queries like this:
> {code:sql}
> select sal from emp where 2 IN (select deptno from dept)
> {code}
> SubQueryRemoveRule rules expand query plan in such a way:
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalJoin(condition=[=(2, $9)], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalProject(DEPTNO=[$0])
>         LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> Since join condition is actually a filter condition, it will be pushed into 
> the filter during further planning stages and join with the true condition is 
> left.
> But these types of the queries may be rewritten in the same way as EXISTS 
> queries:
> {code:sql}
> select sal from emp where EXISTS (select deptno from dept where deptno=2)
> {code}
> with the more optimal plan:
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalJoin(condition=[true], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalAggregate(group=[{0}])
>         LogicalProject(i=[true])
>           LogicalFilter(condition=[=($0, 2)])
>             LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to