[jira] [Commented] (CALCITE-6650) Optimize the IN sub-query and SOME sub-query by Metadata RowCount

xiong duan (Jira) Wed, 30 Oct 2024 05:31:50 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17894188#comment-17894188
 ]


xiong duan commented on CALCITE-6650:
-------------------------------------

Hi [~zabetak] . The PruneEmptyRules can't always get the best plan when the 
sub-query returns 0 rows.

For example:
{code:java}
select * from emp e where empno > some (select 1 from dept where false){code}
Before optimize:
{code:java}
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
    LogicalFilter(condition=[OR(AND(IS TRUE(>($0, $9)), <>($10, 0)), AND(>($10, 
$11), null, <>($10, 0), IS NOT TRUE(>($0, $9))), AND(>($0, $9), <>($10, 0), IS 
NOT TRUE(>($0, $9)), <=($10, $11)))])
      LogicalJoin(condition=[true], joinType=[inner])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
        LogicalProject(m=[$0], c=[$1], d=[$1])
          LogicalAggregate(group=[{}], m=[MIN($0)], c=[COUNT()])
            LogicalProject(EXPR$0=[1])
              LogicalFilter(condition=[false])
                LogicalTableScan(table=[[CATALOG, SALES, DEPT]]){code}
After optimize:
{code:java}
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
    LogicalFilter(condition=[OR(AND(>($0, $9), <>($10, 0)), AND(>($0, $9), 
<>($10, 0), IS NOT TRUE(>($0, $9)), <=($10, $11)))])
      LogicalJoin(condition=[true], joinType=[inner])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
        LogicalProject(m=[$0], c=[$1], d=[$1])
          LogicalAggregate(group=[{}], m=[MIN($0)], c=[COUNT()])
            LogicalValues(tuples=[[]]){code}
Because the Rule can't convert an Aggregate to empty because an aggregate 
without a GROUP BY key always returns 1 row, even over empty input.

> Optimize the IN sub-query and SOME sub-query by Metadata RowCount
> -----------------------------------------------------------------
>
>                 Key: CALCITE-6650
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6650
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: xiong duan
>            Priority: Major
>              Labels: pull-request-available
>
> As same as the UNIQUE sub-query, before we convert the sub-query to RelNode. 
> we use the Metadata to optimize it.
>  * IN sub-query, If the sub-query is guaranteed to produce no row, just 
> return FALSE.
>  * SOME sub-query, If the sub-query is guaranteed to produce no row, just 
> return FALSE.
> SQL Example:
> {code:java}
> select * from dept as d where deptno not in (select count(*) from emp e 
> having false)
> Before this PR:
> LogicalProject(DEPTNO=[$0], NAME=[$1])
>   LogicalProject(DEPTNO=[$0], NAME=[$1])
>     LogicalFilter(condition=[OR(=($2, 0), AND(IS NULL($5), >=($3, $2)))])
>       LogicalJoin(condition=[=(CAST($0):BIGINT NOT NULL, $4)], 
> joinType=[left])
>         LogicalJoin(condition=[true], joinType=[inner])\n          
>           LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>           LogicalProject(c=[$0], ck=[$0])
>             LogicalAggregate(group=[{}], c=[COUNT()])
>               LogicalValues(tuples=[[]])
>         LogicalProject(EXPR$0=[$0], i=[true])
>           LogicalValues(tuples=[[]])\n
> After this PR:
> LogicalProject(DEPTNO=[$0], NAME=[$1])
>   LogicalTableScan(table=[[CATALOG, SALES, DEPT]]) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (CALCITE-6650) Optimize the IN sub-query and SOME sub-query by Metadata RowCount

Reply via email to