[ 
https://issues.apache.org/jira/browse/CALCITE-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Crozon updated CALCITE-4037:
------------------------------------
    Description: 
When a grouping column is given an alias but the aggregated value is not, the 
alias is lost: 
{code}
select deptno as x, sum(sal)
from emp
group by deptno
{code}
has the following plan
{code}
LogicalAggregate(group=[{0}], EXPR$1=[SUM($1)])
  LogicalProject(X=[$7], SAL=[$5])
    LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
which becomes
{code}
LogicalAggregate(group=[{7}], EXPR$1=[SUM($5)])
  LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
 

after AggregateProjectMergeRule is applied.

I attempted a fix that compares the field names of the project node's row type 
with those of its input and skips merging with the agg node when they don't 
match. That works for the use case above, but it breaks quite a few unit tests. 
I believe some of them should be updated, for example testAggregateMerge1, 
where the aliases in the SQL queries are lost in the final plan. For other use 
cases, mostly those involving a join, the simple column name comparison is not 
enough. 
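
For illustration only, the name comparison described above can be sketched with plain Java collections. This is not the attached patch and does not use Calcite's API: the class AliasCheck, the method renamesAnyField, and the EMP column list are hypothetical stand-ins, and the sketch assumes every project expression is a plain input reference.

```java
import java.util.List;

/** Hypothetical stand-in for the check described above; not Calcite API. */
public class AliasCheck {

  /**
   * Returns true if the projection gives any referenced input column a new name.
   *
   * @param inputFieldNames   field names of the project's input row type
   * @param projectedOrdinals input ordinal referenced by each project output
   * @param projectFieldNames field names of the project's own row type
   */
  public static boolean renamesAnyField(List<String> inputFieldNames,
      List<Integer> projectedOrdinals, List<String> projectFieldNames) {
    for (int i = 0; i < projectedOrdinals.size(); i++) {
      String inputName = inputFieldNames.get(projectedOrdinals.get(i));
      if (!inputName.equals(projectFieldNames.get(i))) {
        // Merging the project away would drop this alias.
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Assumed EMP column list, chosen so that $5 = SAL and $7 = DEPTNO as in the plans above.
    List<String> emp = List.of(
        "EMPNO", "ENAME", "JOB", "MGR", "HIREDATE", "SAL", "COMM", "DEPTNO");

    // LogicalProject(X=[$7], SAL=[$5]) renames DEPTNO to X, so the merge would be skipped.
    System.out.println(renamesAnyField(emp, List.of(7, 5), List.of("X", "SAL")));      // true

    // Without the alias the names match, so the merge could proceed.
    System.out.println(renamesAnyField(emp, List.of(7, 5), List.of("DEPTNO", "SAL"))); // false
  }
}
```

A check of this shape only looks at one project over one input, which may be part of why it falls short once a join changes where columns come from.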

 

 

  was:
When a grouping column is given an alias but the aggregated value is not, the 
alias is lost: 

{{select deptno as x, sum(sal)}}
 {{from emp}}
 {{group by deptno}}

has the following plan

{{LogicalAggregate(group=[\\{0}], EXPR$1=[SUM($1)])}}
 {{  LogicalProject(X=[$7], SAL=[$5])}}
 {{    LogicalTableScan(table=[[CATALOG, SALES, EMP]])}}

 which becomes  

{{LogicalAggregate(group=[\\{7}], EXPR$1=[SUM($5)])}}
 {{  LogicalTableScan(table=[[CATALOG, SALES, EMP]])}}

after AggregateProjectMergeRule is applied.

I attempted a fix that compares the field names of the project node's row type 
with those of its input and skips merging with the agg node when they don't 
match. That works for the use case above, but it breaks quite a few unit tests. 
I believe some of them should be updated, for example testAggregateMerge1, 
where the aliases in the SQL queries are lost in the final plan. For other use 
cases, mostly those involving a join, the simple column name comparison is not 
enough. 

 

 


> AggregateProjectMergeRule doesn't always respect column aliases
> ---------------------------------------------------------------
>
>                 Key: CALCITE-4037
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4037
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Sylvain Crozon
>            Priority: Major
>         Attachments: fix-AggregateProjectMergeRule.patch
>
>
> When a grouping column is given an alias but the aggregated value is not, 
> the alias is lost: 
> {code}
> select deptno as x, sum(sal)
> from emp
> group by deptno
> {code}
> has the following plan
> {code}
> LogicalAggregate(group=[{0}], EXPR$1=[SUM($1)])
>   LogicalProject(X=[$7], SAL=[$5])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> which becomes
> {code}
> LogicalAggregate(group=[{7}], EXPR$1=[SUM($5)])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
>  
> after AggregateProjectMergeRule is applied.
>
> I attempted a fix that compares the field names of the project node's row type 
> with those of its input and skips merging with the agg node when they don't 
> match. That works for the use case above, but it breaks quite a few unit tests. 
> I believe some of them should be updated, for example testAggregateMerge1, 
> where the aliases in the SQL queries are lost in the final plan. For other use 
> cases, mostly those involving a join, the simple column name comparison is not 
> enough. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
