[jira] [Commented] (CALCITE-2936) Simplify EXISTS or NOT EXISTS sub-query that has "GROUP BY ()"

Haisheng Yuan (JIRA) Fri, 22 Mar 2019 12:00:16 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799268#comment-16799268
 ]


Haisheng Yuan commented on CALCITE-2936:
----------------------------------------

[~danny0405] Let's continue the discussion here. I understand that the
description in the code concerns you, but I think the comments are misleading:
the word should be "determine", not "estimate". The minRowCount and maxRowCount
are provided to help determine whether we can do further optimization, such as
aggregate / sort removal or an existential check, not for cardinality and cost
estimation. I don't know how much value they would provide if they were
estimated values. Should we update the comments, if I haven't misunderstood
the intention?
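
To make the distinction concrete, here is a minimal sketch of how guaranteed
row-count bounds could drive the existential check. The class and method names
are made up for illustration and are not Calcite APIs; the sketch only assumes
that the two bounds are hard guarantees (never estimates) and that an unknown
upper bound is represented as positive infinity.

```java
import java.util.Optional;

/** Hypothetical illustration only; none of these names exist in Calcite. */
final class ExistsSimplifierSketch {

  /**
   * Tries to fold EXISTS(sub-query) into a Boolean constant.
   *
   * @param minRowCount guaranteed lower bound on the sub-query's row count
   * @param maxRowCount guaranteed upper bound on the sub-query's row count
   * @return the constant, or empty if the bounds are inconclusive
   */
  static Optional<Boolean> simplifyExists(double minRowCount, double maxRowCount) {
    if (minRowCount >= 1d) {
      // At least one row is always produced, so EXISTS is always true.
      return Optional.of(true);
    }
    if (maxRowCount < 1d) {
      // No row can ever be produced, so EXISTS is always false.
      return Optional.of(false);
    }
    // The sub-query may or may not return rows; it must be kept.
    return Optional.empty();
  }

  /** NOT EXISTS folds to the negation of whatever EXISTS folds to. */
  static Optional<Boolean> simplifyNotExists(double minRowCount, double maxRowCount) {
    return simplifyExists(minRowCount, maxRowCount).map(b -> !b);
  }

  public static void main(String[] args) {
    // Aggregate with no grouping columns: exactly one row, so min = max = 1.
    System.out.println(simplifyExists(1d, 1d));
    System.out.println(simplifyNotExists(1d, 1d));
    // Plain filtered scan: anywhere from zero rows up to an unknown bound.
    System.out.println(simplifyExists(0d, Double.POSITIVE_INFINITY));
  }
}
```

The folding is only sound because the bounds are guarantees. If minRowCount
were an estimate, returning true for the first case could change query results.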

> Simplify EXISTS or NOT EXISTS sub-query that has "GROUP BY ()"
> --------------------------------------------------------------
>
>                 Key: CALCITE-2936
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2936
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: Haisheng Yuan
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> An EXISTS or NOT EXISTS sub-query whose inner child is an aggregate with no 
> grouping columns should be simplified to a Boolean constant.
> Example:
> {code:java}
> exists(select sum(i) from X) --> true
> not exists(select sum(i) from X) --> false
> {code}
> Repro:
> {code:java}
> @Test public void testExistentialSubquery() {
>   final String sql = "SELECT e1.empno\n"
>       + "FROM emp e1 where exists\n"
>       + "(select avg(sal) from emp e2 where e1.empno = e2.empno )";
>   sql(sql).decorrelate(true).ok();
> }
> {code}
> We got plan:
> {code:java}
> LogicalProject(EMPNO=[$0])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EMPNO0=[CAST($9):INTEGER], $f1=[CAST($10):BOOLEAN])
>     LogicalJoin(condition=[=($0, $9)], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>         LogicalProject(EMPNO=[$0], $f0=[true])
>           LogicalAggregate(group=[{0}], EXPR$0=[AVG($1)])
>             LogicalProject(EMPNO=[$0], SAL=[$5])
>               LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> The preferred plan should be:
> {code:java}
> LogicalProject(EMPNO=[$0])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
