[jira] [Commented] (CALCITE-5940) Add the Rules to optimize Limit

Julian Hyde (Jira) Mon, 21 Aug 2023 00:37:05 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756706#comment-17756706
 ]


Julian Hyde commented on CALCITE-5940:
--------------------------------------

You need a rule to merge limits. Since limits are done by the {{Sort}} 
operator, the rule should be called {{SortMergeRule}}. I believe that such a 
rule will naturally be able to avoid both redundant sorts and redundant limits.

In the cases you have mentioned above, both limits have an empty sort key. But 
empty sort keys are a special case, and we can generalize. I believe that one 
limit can subsume the other limit if its sort key is a prefix of the other sort 
key.  Thus "select * from (select * from emp order by deptno limit 10) order by 
deptno, job limit 100" can be optimized to "limit 10", but "select * from 
(select * from emp order by deptno, job limit 10) order by deptno limit 100" 
cannot. But please check my logic.

A rule to push a limit through a Project will also be useful. I believe 
{{SortProjectTransposeRule}} does this.

> Add the Rules to optimize Limit
> -------------------------------
>
>                 Key: CALCITE-5940
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5940
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: LakeShen
>            Priority: Major
>             Fix For: 1.36.0
>
>
> Now in calcite,the Limit will be represented using 
> LogicalSort(fetch=[xx]),but there are few rules to optimize Limit.
> In trino and presto,there are many optimization rules to optimize Limit.
> For example,the sql:
> {code:java}
> select * from nation limit 0 {code}
> The limit 0 will use empty ValuesNode(Calcite LogicalValues) to optimize,so 
> the SQL is not delivered to the Worker compute,the rule could see: 
> [EvaluateZeroLimit|https://github.com/prestodb/presto/blob/fea80c96ddfe4dc42f79c3cff9294b88595275ce/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/EvaluateZeroLimit.java#L28C1-L28C31]
> The sql:
> {code:java}
> select concat('-',N_REGIONKEY) from (select * from nation limit 10000) limit 
> 10 {code}
> It would be optimized by 
> [MergeLimits|https://github.com/prestodb/presto/blob/fea80c96ddfe4dc42f79c3cff9294b88595275ce/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/MergeLimits.java#L26]
>  rule to:
> {code:java}
> select concat('-',N_REGIONKEY) from nation limit 10  {code}
> The value of limit takes the minimum of the outer limit and the inner limit.
> The sql:
> {code:java}
> select concat('-',N_REGIONKEY) from (SELECT * FROM nation order BY 
> N_REGIONKEY DESC LIMIT 10000) limit 10 {code}
> It would be optimized by 
> [MergeLimitWithTopN|https://github.com/prestodb/presto/blob/fea80c96ddfe4dc42f79c3cff9294b88595275ce/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/MergeLimitWithTopN.java#L28C1-L28C31]
>  rule to:
> {code:java}
> SELECT concat('-',N_REGIONKEY) FROM nation order BY N_REGIONKEY DESC LIMIT 
> 10{code}
> So I propose to add these three rules to Calcite as well, to optimize the 
> Limit case.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (CALCITE-5940) Add the Rules to optimize Limit

Reply via email to