[jira] [Comment Edited] (CALCITE-6038) Add optimization to remove redundant TopN when its input's row number is less or equal to one

LakeShen (Jira) Fri, 06 Oct 2023 13:24:04 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772605#comment-17772605
 ]


LakeShen edited comment on CALCITE-6038 at 10/6/23 8:23 PM:
------------------------------------------------------------

Hi [~julianhyde], thanks for your reply.

Sorry, my description was not very clear; I have updated it.

For the `select count(*) from t limit 0` case, I think using 
`PruneEmptyRules#SORT_FETCH_ZERO_INSTANCE` would be a better choice. That rule 
optimizes the plan earlier than the TopN optimization, so handling empty 
results should be left to `PruneEmptyRules`.

In this Jira, if the TopN's fetch is greater than 0, I would check the maximum 
row count of the TopN's input, and if it is at most one, we can remove the 
redundant TopN.
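
The check described above can be sketched in plain Java. This is only an illustration on a toy plan model, not Calcite code: `Plan`, `Scan`, `GlobalAggregate`, `TopN` and `removeRedundantTopN` are hypothetical names invented for the example, and a real rule would presumably work on `Sort` nodes and planner metadata instead.

```java
// Minimal sketch, assuming a toy plan model (hypothetical; not Calcite's API).
class RedundantTopNSketch {

    /** Toy plan node. */
    interface Plan {
        /** Upper bound on the number of rows this node can produce. */
        long maxRowCount();
    }

    /** Table scan with a caller-supplied upper bound on its row count. */
    static final class Scan implements Plan {
        final long maxRows;
        Scan(long maxRows) { this.maxRows = maxRows; }
        @Override public long maxRowCount() { return maxRows; }
    }

    /** Aggregate without GROUP BY, e.g. count(*): always exactly one row. */
    static final class GlobalAggregate implements Plan {
        final Plan input;
        GlobalAggregate(Plan input) { this.input = input; }
        @Override public long maxRowCount() { return 1; }
    }

    /** ORDER BY ... LIMIT fetch. */
    static final class TopN implements Plan {
        final Plan input;
        final long fetch;
        TopN(Plan input, long fetch) { this.input = input; this.fetch = fetch; }
        @Override public long maxRowCount() {
            return Math.min(fetch, input.maxRowCount());
        }
    }

    /**
     * Returns the TopN's input when fetch > 0 and the input yields at most
     * one row; otherwise returns the plan unchanged. fetch == 0 is left
     * alone on purpose, because that case belongs to an empty-pruning rule.
     */
    static Plan removeRedundantTopN(Plan plan) {
        if (plan instanceof TopN) {
            TopN topN = (TopN) plan;
            if (topN.fetch > 0 && topN.input.maxRowCount() <= 1) {
                return topN.input;
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        Plan agg = new GlobalAggregate(new Scan(1_000_000L));
        // SELECT count(*) FROM orders ORDER BY 1 LIMIT 10
        Plan redundant = new TopN(agg, 10);
        System.out.println(removeRedundantTopN(redundant) == agg); // true

        // SELECT * FROM orders ORDER BY 1 LIMIT 10: input may have many rows
        Plan needed = new TopN(new Scan(1_000_000L), 10);
        System.out.println(removeRedundantTopN(needed) == needed); // true
    }
}
```

Sorting or limiting a relation with at most one row cannot change its result, which is why the sketch returns the input node itself once the fetch is known to be positive.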

In Trino and Presto, this rule is called 
[RemoveRedundantTopN|https://github.com/prestodb/presto/blob/21ab1ea2425e4bc65532ab156c60333e5a72dd09/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/RemoveRedundantTopN.java#L27C1-L28C34].
They also use the `EvaluateZeroLimit` rule to handle LIMIT 0 cases. 
`EvaluateZeroLimit` is applied before `RemoveRedundantTopN`, so it optimizes 
the plan first, and in practice `RemoveRedundantTopN` never sees a LIMIT 0.
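
The ordering argument can be illustrated with another small sketch. It assumes the two rewrites are simply applied one after the other on a toy plan model; `pruneZeroFetch`, `removeRedundantTopN`, `optimize` and the node classes are hypothetical names for this example and do not reproduce the Presto, Trino or Calcite implementations.

```java
// Minimal sketch, assuming sequential rule application on a toy plan model
// (hypothetical; not the Presto, Trino or Calcite API).
class RuleOrderingSketch {

    /** Toy plan node. */
    interface Plan {
        /** Upper bound on the number of rows this node can produce. */
        long maxRowCount();
    }

    /** Relation known to produce no rows. */
    static final class Empty implements Plan {
        @Override public long maxRowCount() { return 0; }
    }

    /** Stand-in for an input with exactly one row, e.g. a global count(*). */
    static final class OneRow implements Plan {
        @Override public long maxRowCount() { return 1; }
    }

    /** ORDER BY ... LIMIT fetch. */
    static final class TopN implements Plan {
        final Plan input;
        final long fetch;
        TopN(Plan input, long fetch) { this.input = input; this.fetch = fetch; }
        @Override public long maxRowCount() {
            return Math.min(fetch, input.maxRowCount());
        }
    }

    /** Zero-limit rewrite: a TopN with fetch == 0 becomes an empty relation. */
    static Plan pruneZeroFetch(Plan plan) {
        if (plan instanceof TopN && ((TopN) plan).fetch == 0) {
            return new Empty();
        }
        return plan;
    }

    /** Redundant-TopN rewrite: drop a TopN whose input has at most one row. */
    static Plan removeRedundantTopN(Plan plan) {
        if (plan instanceof TopN) {
            TopN topN = (TopN) plan;
            if (topN.fetch > 0 && topN.input.maxRowCount() <= 1) {
                return topN.input;
            }
        }
        return plan;
    }

    /** The zero-limit rewrite runs first, so the second rewrite never sees fetch == 0. */
    static Plan optimize(Plan plan) {
        return removeRedundantTopN(pruneZeroFetch(plan));
    }

    public static void main(String[] args) {
        Plan one = new OneRow();
        System.out.println(optimize(new TopN(one, 0)) instanceof Empty); // true
        System.out.println(optimize(new TopN(one, 10)) == one);          // true
    }
}
```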

So my idea is to add this optimization to Calcite, and this Jira covers only 
TopN.

WDYT?


was (Author: shenlang):
Hi [~julianhyde], thanks for your reply.

Sorry, my description was not very clear; I have updated it.

For the `select count(*) from t limit 0` case, I think using 
`PruneEmptyRules#SORT_FETCH_ZERO_INSTANCE` would be a better choice. That rule 
optimizes the plan earlier than the TopN optimization, so handling empty 
results should be left to `PruneEmptyRules`.

In this Jira, I would check the maximum row count of the TopN's input, and if 
it is at most one, we can remove the redundant TopN. In Trino and Presto, this 
rule is called 
[RemoveRedundantTopN|https://github.com/prestodb/presto/blob/21ab1ea2425e4bc65532ab156c60333e5a72dd09/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/RemoveRedundantTopN.java#L27C1-L28C34].
They also use the `EvaluateZeroLimit` rule to handle LIMIT 0 cases. 
`EvaluateZeroLimit` is applied before `RemoveRedundantTopN`, so it optimizes 
the plan first, and in practice `RemoveRedundantTopN` never sees a LIMIT 0.


So my idea is to add this optimization to Calcite, and this Jira covers only 
TopN.

WDYT?

> Add optimization to remove redundant TopN when its input's row number is less 
> or equal to one
> ---------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-6038
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6038
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: LakeShen
>            Priority: Major
>
> In Calcite, TopN is represented by `Sort`. When the maximum row count of a 
> TopN's input is less than or equal to 1, we can remove the redundant TopN.
> For example, given the SQL:
> {code:java}
> SELECT count(*) FROM orders ORDER BY 1 LIMIT 10 {code}
> Because `SELECT count(*) FROM orders` returns exactly one row, we can 
> remove `ORDER BY 1 LIMIT 10`. After the optimization:
> {code:java}
> SELECT count(*) FROM orders  {code}
> The logic above is the same as that of Presto/Trino's 
> [RemoveRedundantTopN|https://github.com/prestodb/presto/blob/21ab1ea2425e4bc65532ab156c60333e5a72dd09/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/RemoveRedundantTopN.java#L27C1-L28C34]
>  rule:
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
