[ 
https://issues.apache.org/jira/browse/CALCITE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756929#comment-16756929
 ] 

Hongze Zhang edited comment on CALCITE-2798 at 1/31/19 7:37 AM:
----------------------------------------------------------------

Thanks [~julianhyde], I am with you that the LogicalSort cannot be removed 
unreasonably.

My imagination was to remove the collation guarantee for logical nodes except 
for LogicalSort, such as the LogicalProject and LogicalFilter and others (Sorry 
maybe a little far from this topic), to make Calcite generate the initial 
logical plan with "smallest collation requirement". For instance, if we use 
Calcite to build a Sort-Sort-Project-Sort relation, the initial plan registered 
to volcano planner will be:
{quote}LogicalSort[0] – input subset collation:[0]
 LogicalSort[0] – input subset collation:[0]
 LogicalProject[0] – input subset collation:[0]
 LogicalSort[0] – input subset collation:[]
{quote}
If we focus on a concept like "smallest collation requirement", maybe we can 
generate and register following rels instead:
{quote}LogicalSort[0] – input subset collation:[]
 LogicalSort[0] – input subset collation:[]
 LogicalProject[] – input subset collation:[]
 LogicalSort[0] – input subset collation:[]
{quote}
That is, there will be no rel node (on LOGICAL level) that really needs sorted 
input, and only the LogicalSort will guarante sorted output.

Do we lose optimizations if we were to do something like this?


was (Author: zhztheplayer):
Thanks [~julianhyde], I am with you that the LogicalSort cannot be removed 
unreasonably.

My imagination was to remove the collation guarantee for logical nodes except 
for LogicalSort, such as the LogicalProject and LogicalFilter and others, 
(Maybe a little far from this topic), to make Calcite generate the initial 
logical plan with "smallest collation requirement". For instance, if we use 
calcite to build a Sort-Sort-Project-Sort relation, the initial plan registered 
to volcano planner will be:
{quote}LogicalSort[0] – input subset collation:[0]
 LogicalSort[0] – input subset collation:[0]
 LogicalProject[0] – input subset collation:[0]
 LogicalSort[0] – input subset collation:[]
{quote}
If we focus on a concept like "smallest collation requirement", maybe we can 
generate and register following rels instead:
{quote}LogicalSort[0] – input subset collation:[]
 LogicalSort[0] – input subset collation:[]
 LogicalProject[] – input subset collation:[]
 LogicalSort[0] – input subset collation:[]
{quote}
That is, there will be no rel node (on LOGICAL level) that really needs sorted 
input, and only the LogicalSort will guarante sorted output.

Do we lose optimizations if we were to do something like this?

> Optimizer should remove ORDER BY in sub-query, provided it has no LIMIT or 
> OFFSET
> ---------------------------------------------------------------------------------
>
>                 Key: CALCITE-2798
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2798
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.18.0
>            Reporter: Vladimir Sitnikov
>            Assignee: Julian Hyde
>            Priority: Major
>
> The following SQL performs sort twice, however inner sort can be eliminated
> {code}select * from (
>   select * from "emps" 
> order by "emps"."deptno"
> ) order by 1 desc{code}
> The same goes for (window calculation will sort on its own)
> {code}select row_number() over (order by "emps"."deptno")  from (
>   select * from "emps" 
> order by "emps"."deptno" desc
> ){code}
> The same goes for SetOp (union, minus):
> {code}select * from (
>   select * from "emps" 
> order by "emps"."deptno"
> ) union select * from (
>   select * from "emps" 
> order by "emps"."deptno" desc
> ){code}
> There might be other cases like that (e.g. Aggregate, Join, Exchange, 
> SortExchange)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to