[ 
https://issues.apache.org/jira/browse/CALCITE-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790152#comment-16790152
 ] 

Haisheng Yuan commented on CALCITE-2857:
----------------------------------------

A simpler repro query would be:

{code:sql}
select * from dept where exists (select * from emp where emp.deptno <> 
dept.deptno)
{code}

We got an initial logic plan:
{code:java}
LogicalProject(DEPTNO=[$0], NAME=[$1])
  LogicalProject(DEPTNO=[$0], NAME=[$1], DEPTNO0=[CAST($2):INTEGER], 
$f1=[CAST($3):BOOLEAN])
    LogicalJoin(condition=[=($0, $2)], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
      LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
        LogicalProject(DEPTNO0=[$9], $f0=[true])
          LogicalJoin(condition=[<>($7, $9)], joinType=[inner])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
            LogicalProject(DEPTNO=[$0])
              LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}

Which is indeed complicated.


> SemiJoin extends Join directly instead of EquiJoin to support non-equi 
> condition
> --------------------------------------------------------------------------------
>
>                 Key: CALCITE-2857
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2857
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: godfrey he
>            Priority: Major
>
> SemiJoin extends Join directly instead of EquiJoin to support non-equi 
> condition, 
>  e.g.
> {code:sql}
> select * from (select * from dept where dept.deptno in (select emp.deptno 
> from emp where emp.job <> dept.name))R where R.deptno <= 10
> {code}
> Currently, this query will be converted to
> {code}
> SemiJoin(condition=[AND(=($1, $3), =($0, $2))], joinType=[inner])
>   LogicalFilter(condition=[<=($0, 10)])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>   LogicalProject(DEPTNO=[$7], NAME=[$9])
>     LogicalJoin(condition=[<>($2, $9)], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalAggregate(group=[{0}])
>         LogicalProject(NAME=[$1])
>           LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> however the following plan is more efficient than the above one:
> {code}
> LogicalProject(DEPTNO=[$0], NAME=[$1])
>   LogicalFilter(condition=[<=($0, 10)])
>     SemiJoin(condition=[=AND(=($0, $3), <>($1, $2))], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>       LogicalProject(JOB=[$2], DEPTNO=[$7])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to