Gautam Kumar Parai created CALCITE-1288:
-------------------------------------------

             Summary: Avoid doing the same join twice if count(distinct) exists
                 Key: CALCITE-1288
                 URL: https://issues.apache.org/jira/browse/CALCITE-1288
             Project: Calcite
          Issue Type: Improvement
            Reporter: Gautam Kumar Parai
            Assignee: Gautam Kumar Parai


When a query has one distinct aggregate and one or more non-distinct 
aggregates, the planner need not produce the join-based plan shown below, 
which evaluates the same join twice. We can generate multi-phase aggregates 
instead.
{code}
select emp.empno, count(*), avg(distinct dept.deptno) 
from sales.emp emp inner join sales.dept dept 
on emp.deptno = dept.deptno 
group by emp.empno

LogicalProject(EMPNO=[$0], EXPR$1=[$1], EXPR$2=[$3])
  LogicalJoin(condition=[IS NOT DISTINCT FROM($0, $2)], joinType=[inner])
    LogicalAggregate(group=[{0}], EXPR$1=[COUNT()])
      LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
        LogicalJoin(condition=[=($7, $9)], joinType=[inner])
          LogicalTableScan(table=[[CATALOG, SALES, EMP]])
          LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
    LogicalAggregate(group=[{0}], EXPR$2=[AVG($1)])
      LogicalAggregate(group=[{0, 1}])
        LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
          LogicalJoin(condition=[=($7, $9)], joinType=[inner])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
            LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
The more efficient form should look like this:
{code}
select emp.empno, count(*), avg(distinct dept.deptno) 
from sales.emp emp inner join sales.dept dept 
on emp.deptno = dept.deptno 
group by emp.empno

LogicalAggregate(group=[{0}], EXPR$1=[SUM($2)], EXPR$2=[AVG($1)])
  LogicalAggregate(group=[{0, 1}], EXPR$1=[COUNT()])
    LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
      LogicalJoin(condition=[=($7, $9)], joinType=[inner])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
        LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
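The equivalence between the original query and the two-phase form can be checked on toy data. The following is only a sketch, not Calcite code: it uses Python's built-in sqlite3 module, and the simplified two-column tables, the sample rows, and the column alias cnt are assumptions made for illustration. The inner query groups by (empno, deptno) and counts rows per group; the outer query sums those counts and averages the already-distinct deptno values.

```python
import sqlite3

# Toy stand-ins for sales.emp and sales.dept (hypothetical data).
# Duplicate keys on both sides make count(*) differ from the number
# of distinct deptno values, so the check is not trivial.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empno INTEGER, deptno INTEGER);
    CREATE TABLE dept (deptno INTEGER, name TEXT);
    INSERT INTO emp VALUES (1, 10), (1, 10), (1, 20), (2, 20), (3, 30);
    INSERT INTO dept VALUES (10, 'A'), (20, 'B'), (20, 'B2');
""")

# The query from the issue: one distinct and one non-distinct aggregate.
direct = conn.execute("""
    SELECT emp.empno, COUNT(*), AVG(DISTINCT dept.deptno)
    FROM emp INNER JOIN dept ON emp.deptno = dept.deptno
    GROUP BY emp.empno
    ORDER BY emp.empno
""").fetchall()

# Two-phase form: the join is evaluated once.
# Phase 1 groups by (empno, deptno) and keeps a partial count.
# Phase 2 groups by empno, sums the partial counts, and averages deptno,
# which is already distinct within each empno after phase 1.
two_phase = conn.execute("""
    SELECT empno, SUM(cnt), AVG(deptno)
    FROM (
        SELECT emp.empno AS empno, dept.deptno AS deptno, COUNT(*) AS cnt
        FROM emp INNER JOIN dept ON emp.deptno = dept.deptno
        GROUP BY emp.empno, dept.deptno
    )
    GROUP BY empno
    ORDER BY empno
""").fetchall()

print(direct)
print(two_phase)
```

On this data both queries return the same rows, which matches the shape of the proposed plan: the lower aggregate on group {0, 1} with COUNT(), and the upper aggregate on group {0} with SUM over the partial counts.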



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
