[ 
https://issues.apache.org/jira/browse/CALCITE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757513#comment-17757513
 ] 

LakeShen commented on CALCITE-5940:
-----------------------------------

Hi [~julianhyde], thanks for your advice. I think SortMergeRule is trying to do 
a bit too much.

For example, in Presto or Trino, limit, sort, topN, and offset each have their 
own node. The mapping is as follows:
1. limit -> LimitNode
2. sort -> SortNode
3. topn -> TopNNode
4. offset -> OffsetNode

Each of the nodes above has corresponding optimization rules, such as 
MergeLimits, RemoveRedundantSort, RemoveRedundantTopN, and so on.


But in Calcite, all four of these node types are represented by LogicalSort, so 
SortMergeRule has a lot to do and must consider a number of different 
cases.
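
To make the case analysis concrete, here is a minimal sketch of how a single sort-like node can stand for all four Presto/Trino node kinds, depending on which of its sort keys, offset, and fetch are set. SortSpec and its Kind enum are hypothetical names for illustration only and are not Calcite's API; the classification itself is my assumption about how the cases line up.

```java
import java.util.List;

// Hypothetical, simplified stand-in for a sort-like node that carries
// sort keys, an offset, and a fetch. Not Calcite's actual classes.
final class SortSpec {
    enum Kind { LIMIT, SORT, TOPN, OFFSET, MIXED, NOOP }

    final List<String> sortKeys; // empty means "no ORDER BY"
    final Long offset;           // null means "no OFFSET"
    final Long fetch;            // null means "no LIMIT"

    SortSpec(List<String> sortKeys, Long offset, Long fetch) {
        this.sortKeys = sortKeys;
        this.offset = offset;
        this.fetch = fetch;
    }

    // Classifies this node into the Presto/Trino-style node it corresponds to.
    Kind kind() {
        boolean sorted = !sortKeys.isEmpty();
        boolean hasOffset = offset != null;
        boolean hasFetch = fetch != null;
        if (!sorted && !hasOffset && !hasFetch) return Kind.NOOP;
        if (!sorted && !hasOffset) return Kind.LIMIT;            // fetch only
        if (sorted && !hasOffset && !hasFetch) return Kind.SORT; // keys only
        if (sorted && !hasOffset) return Kind.TOPN;              // keys + fetch
        if (!sorted && !hasFetch) return Kind.OFFSET;            // offset only
        return Kind.MIXED; // offset combined with keys and/or fetch
    }
}
```

A rule that merges two such nodes has to handle every pairing of these kinds, which is where the number of cases comes from.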


Could we split SortMergeRule into several smaller optimization rules? 
From an implementation point of view, it might be easier.
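
As a rough sketch of what "smaller rules" could look like, the code below models the three rules named in the issue description (EvaluateZeroLimit, MergeLimits, MergeLimitWithTopN) as independent rewrites over a toy plan tree. Everything here (LimitRuleSketch, Node, Kind) is hypothetical and deliberately ignores projections and offsets; it is not Calcite or Presto code, only an illustration of one narrow match per rule.

```java
// Toy plan tree plus three single-purpose rewrite rules.
// All names are hypothetical; this is not Calcite's or Presto's API.
final class LimitRuleSketch {
    enum Kind { SCAN, EMPTY, LIMIT, TOPN }

    static final class Node {
        final Kind kind;
        final String sortKey; // set only for TOPN
        final long count;     // row cap for LIMIT and TOPN, -1 otherwise
        final Node input;     // null for SCAN and EMPTY

        Node(Kind kind, String sortKey, long count, Node input) {
            this.kind = kind;
            this.sortKey = sortKey;
            this.count = count;
            this.input = input;
        }
    }

    static Node scan() { return new Node(Kind.SCAN, null, -1, null); }
    static Node limit(long n, Node in) { return new Node(Kind.LIMIT, null, n, in); }
    static Node topN(String key, long n, Node in) { return new Node(Kind.TOPN, key, n, in); }

    // Each rule returns a rewritten node, or the same node when it does not match.

    // LIMIT 0 -> empty result, so nothing below it needs to run.
    static Node evaluateZeroLimit(Node n) {
        if (n.kind == Kind.LIMIT && n.count == 0) {
            return new Node(Kind.EMPTY, null, -1, null);
        }
        return n;
    }

    // LIMIT over LIMIT -> one LIMIT with the smaller count.
    static Node mergeLimits(Node n) {
        if (n.kind == Kind.LIMIT && n.input != null && n.input.kind == Kind.LIMIT) {
            return limit(Math.min(n.count, n.input.count), n.input.input);
        }
        return n;
    }

    // LIMIT over TOPN -> one TOPN with the same sort key and the smaller count.
    static Node mergeLimitWithTopN(Node n) {
        if (n.kind == Kind.LIMIT && n.input != null && n.input.kind == Kind.TOPN) {
            return topN(n.input.sortKey, Math.min(n.count, n.input.count), n.input.input);
        }
        return n;
    }
}
```

For instance, applying mergeLimits to limit(10, limit(10000, scan())) yields a single limit of 10 over the scan, matching the second example in the issue description. Because each rule matches one shape only, it can be tested and reasoned about in isolation.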

> Add the Rules to optimize Limit
> -------------------------------
>
>                 Key: CALCITE-5940
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5940
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: LakeShen
>            Priority: Major
>
> Currently in Calcite, a Limit is represented as 
> LogicalSort(fetch=[xx]), but there are few rules to optimize Limit.
> In Trino and Presto, there are many optimization rules for Limit.
> For example, the SQL:
> {code:java}
> select * from nation limit 0 {code}
> With limit 0, the plan is replaced by an empty ValuesNode (Calcite's 
> LogicalValues), so the query is never sent to the workers for computation. See the rule: 
> [EvaluateZeroLimit|https://github.com/prestodb/presto/blob/fea80c96ddfe4dc42f79c3cff9294b88595275ce/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/EvaluateZeroLimit.java#L28C1-L28C31]
> The SQL:
> {code:java}
> select concat('-',N_REGIONKEY) from (select * from nation limit 10000) limit 
> 10 {code}
> It would be optimized by 
> [MergeLimits|https://github.com/prestodb/presto/blob/fea80c96ddfe4dc42f79c3cff9294b88595275ce/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/MergeLimits.java#L26]
>  rule to:
> {code:java}
> select concat('-',N_REGIONKEY) from nation limit 10  {code}
> The merged limit is the minimum of the outer limit and the inner limit.
> The SQL:
> {code:java}
> select concat('-',N_REGIONKEY) from (SELECT * FROM nation order BY 
> N_REGIONKEY DESC LIMIT 10000) limit 10 {code}
> It would be optimized by 
> [MergeLimitWithTopN|https://github.com/prestodb/presto/blob/fea80c96ddfe4dc42f79c3cff9294b88595275ce/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/MergeLimitWithTopN.java#L28C1-L28C31]
>  rule to:
> {code:java}
> SELECT concat('-',N_REGIONKEY) FROM nation order BY N_REGIONKEY DESC LIMIT 
> 10{code}
> So I propose to add these three rules to Calcite as well, to optimize the 
> Limit case.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
