[ 
https://issues.apache.org/jira/browse/CALCITE-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715881#comment-14715881
 ] 

Sean Hsuan-Yi Chu commented on CALCITE-844:
-------------------------------------------

[~julianhyde] Before you start the review, could you first take a look at this
comment?
Essentially, there are two ways to add the LogicalProject below
LogicalWindow:
(1). In CalcRelSplitter:
As Calcite splits the expressions, it might be possible to add a level at the
bottom to create the project. However, there are two issues with this
approach. First, the splitter's objective is to split expressions using the
fewest levels. Even though adding this LogicalProject helps overall
performance, the operation seems to contradict that objective (it sounds like
doing too many things in a single rule). Second, it is very difficult to take
this approach without major modifications to the code. (At least, I still
cannot find a proper place in the logic to do this.)

(2). Add a new rule (as implemented in the pull request):
Based on the points above, I think it is better to go this route.
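
To make the effect of option (2) concrete, here is a minimal sketch of the
bookkeeping such a rule would need. It is a toy model in plain Java, not the
code in the pull request and not Calcite's API: the class WindowInputPruning
and its methods are hypothetical names, and plain integer lists stand in for
the window's partition keys and aggregate arguments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model only: integer column indexes stand in for Calcite's input refs.
final class WindowInputPruning {

    /** Sorted, de-duplicated input columns that the window actually reads. */
    static List<Integer> usedColumns(List<Integer> partitionKeys, List<Integer> aggArgs) {
        TreeSet<Integer> used = new TreeSet<>(partitionKeys);
        used.addAll(aggArgs);
        return new ArrayList<>(used);
    }

    /** Position of an old input column inside the pruned projection. */
    static int remap(List<Integer> projected, int oldIndex) {
        int newIndex = projected.indexOf(oldIndex);
        if (newIndex < 0) {
            throw new IllegalArgumentException("column $" + oldIndex + " was pruned");
        }
        return newIndex;
    }

    /** Window outputs follow the input columns, so the k-th aggregate lands here. */
    static int aggOutputIndex(List<Integer> projected, int k) {
        return projected.size() + k;
    }

    public static void main(String[] args) {
        // First plan in the issue: partition {7}, aggs [SUM($7)] over the EMP scan.
        List<Integer> projected = usedColumns(List.of(7), List.of(7));
        System.out.println("columns kept by the new project: " + projected);
        System.out.println("partition key $7 is rewritten to $" + remap(projected, 7));
        System.out.println("top project reads $" + aggOutputIndex(projected, 0)
            + " instead of $9");
    }
}
```

For the first query in the issue, this keeps only DEPTNO below the window, so
the window would partition on $0 and the top project would read $1.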

> The lack of Project under LogicalWindow hurts the performance
> -------------------------------------------------------------
>
>                 Key: CALCITE-844
>                 URL: https://issues.apache.org/jira/browse/CALCITE-844
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Sean Hsuan-Yi Chu
>            Assignee: Julian Hyde
>
> First of all, this issue occurs when HepPlanner is used with the
> ProjectToWindowRule.PROJECT rule.
> A simple query like:
> {code}
> select sum(deptno) over(partition by deptno) as sum1 
> from emp
> {code}
> produces
> {code}
> LogicalProject($0=[$9])
>   LogicalWindow(window#0=[window(partition {7} order by [] range between 
> UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($7)])])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> However, from a performance standpoint, it is better to have a project between
> LogicalWindow and LogicalTableScan, since only one column is used.
> Interestingly, the project does appear when there is an expression in the
> window function. For example,
> {code}
> select sum(deptno + 1) over(partition by deptno) as sum1 
> from emp
> {code}
> produces
> {code}
> LogicalProject($0=[$2])
>   LogicalWindow(window#0=[window(partition {0} order by [] range between 
> UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
>     LogicalProject(DEPTNO=[$7], $1=[+($7, 1)])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> The LogicalProject below the window can trim unused columns or even be
> pushed into the scan, which is a very important optimization that Calcite
> can exploit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
