[
https://issues.apache.org/jira/browse/CALCITE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757513#comment-17757513
]
LakeShen edited comment on CALCITE-5940 at 8/22/23 2:48 PM:
------------------------------------------------------------
Hi [~julianhyde], thanks for your advice. I think SortMergeRule is trying to do
a bit too much.
For example, in Presto and Trino, limit, sort, topN, and offset each have their
own plan node. The mapping is as follows:
1. limit -> LimitNode
2. sort -> SortNode
3. topn -> TopNNode
4. offset -> OffsetNode
Each of these nodes has its own optimization rules, such as MergeLimits,
RemoveRedundantSort, RemoveRedundantTopN, and so on.
But in Calcite, all four kinds of node are represented by LogicalSort, so
SortMergeRule has a lot to do and must handle many separate cases.
Could we split SortMergeRule into several smaller optimization rules?
From an implementation point of view, that might be easier.
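To illustrate how many cases a single rule over LogicalSort has to tell apart, here is a minimal Java sketch. It does not use Calcite's API: SortSpec, Shape, and classify are hypothetical names, and the only assumption is that one node carries an optional sort order, an optional offset, and an optional fetch.

```java
import java.util.Collections;
import java.util.List;

final class SortShapes {
    /** The four node kinds from the Presto/Trino mapping, plus a catch-all. */
    enum Shape { SORT, LIMIT, TOPN, OFFSET, OTHER }

    /** Hypothetical stand-in for the properties of one sort node; null means "not set". */
    static final class SortSpec {
        final List<String> sortKeys;
        final Long offset;
        final Long fetch;

        SortSpec(List<String> sortKeys, Long offset, Long fetch) {
            this.sortKeys = sortKeys;
            this.offset = offset;
            this.fetch = fetch;
        }
    }

    /** Decides which of the dedicated node kinds a combined sort node corresponds to. */
    static Shape classify(SortSpec s) {
        boolean ordered = !s.sortKeys.isEmpty();
        boolean hasOffset = s.offset != null;
        boolean hasFetch = s.fetch != null;

        if (ordered && !hasOffset && !hasFetch) return Shape.SORT;
        if (!ordered && !hasOffset && hasFetch) return Shape.LIMIT;
        if (ordered && !hasOffset && hasFetch) return Shape.TOPN;
        if (!ordered && hasOffset && !hasFetch) return Shape.OFFSET;
        // Remaining combinations, e.g. offset together with fetch, or nothing set.
        return Shape.OTHER;
    }

    public static void main(String[] args) {
        SortSpec limitOnly = new SortSpec(Collections.<String>emptyList(), null, 10L);
        System.out.println(classify(limitOnly));
    }
}
```

A rule that merges two such nodes has to consider pairs of these shapes, which is where the number of cases grows. Smaller rules could each match one pair only.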
> Add the Rules to optimize Limit
> -------------------------------
>
> Key: CALCITE-5940
> URL: https://issues.apache.org/jira/browse/CALCITE-5940
> Project: Calcite
> Issue Type: New Feature
> Reporter: LakeShen
> Priority: Major
>
> Currently in Calcite, a Limit is represented as
> LogicalSort(fetch=[xx]), but there are few rules to optimize it.
> Trino and Presto have many optimization rules for Limit.
> For example, given the SQL:
> {code:java}
> select * from nation limit 0 {code}
> The limit 0 is replaced with an empty ValuesNode (LogicalValues in Calcite), so
> the query is never sent to the workers for execution. See the rule:
> [EvaluateZeroLimit|https://github.com/prestodb/presto/blob/fea80c96ddfe4dc42f79c3cff9294b88595275ce/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/EvaluateZeroLimit.java#L28C1-L28C31]
> Given the SQL:
> {code:java}
> select concat('-',N_REGIONKEY) from (select * from nation limit 10000) limit
> 10 {code}
> It would be optimized by
> [MergeLimits|https://github.com/prestodb/presto/blob/fea80c96ddfe4dc42f79c3cff9294b88595275ce/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/MergeLimits.java#L26]
> rule to:
> {code:java}
> select concat('-',N_REGIONKEY) from nation limit 10 {code}
> The merged limit is the minimum of the outer limit and the inner limit.
> Given the SQL:
> {code:java}
> select concat('-',N_REGIONKEY) from (SELECT * FROM nation order BY
> N_REGIONKEY DESC LIMIT 10000) limit 10 {code}
> It would be optimized by
> [MergeLimitWithTopN|https://github.com/prestodb/presto/blob/fea80c96ddfe4dc42f79c3cff9294b88595275ce/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/MergeLimitWithTopN.java#L28C1-L28C31]
> rule to:
> {code:java}
> SELECT concat('-',N_REGIONKEY) FROM nation order BY N_REGIONKEY DESC LIMIT
> 10{code}
> So I propose to add these three rules to Calcite as well, to optimize the
> Limit case.
>
>
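The arithmetic behind the three proposed rules can be sketched in a few lines of Java. This is only an illustration under stated assumptions: LimitRules, mergedFetch, and isZeroLimit are hypothetical names, not Presto or Calcite API, and the sketch assumes neither limit has an offset.

```java
final class LimitRules {
    /**
     * MergeLimits / MergeLimitWithTopN: when one limit sits directly on top of
     * another limit (or on top of a sort with a limit), the merged node keeps
     * the smaller of the two row counts. Assumes no offset on either node.
     */
    static long mergedFetch(long outerFetch, long innerFetch) {
        return Math.min(outerFetch, innerFetch);
    }

    /** EvaluateZeroLimit: a limit of 0 can be replaced by an empty set of rows. */
    static boolean isZeroLimit(long fetch) {
        return fetch == 0;
    }

    public static void main(String[] args) {
        // Outer "limit 10" over inner "limit 10000", as in the examples above.
        System.out.println(mergedFetch(10, 10000));
        System.out.println(isZeroLimit(0));
    }
}
```

For the TopN case the sort order of the inner node is kept unchanged, and only the row count is reduced.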
--
This message was sent by Atlassian Jira
(v8.20.10#820010)