[
https://issues.apache.org/jira/browse/CALCITE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756929#comment-16756929
]
Hongze Zhang edited comment on CALCITE-2798 at 1/31/19 7:37 AM:
----------------------------------------------------------------
Thanks [~julianhyde], I am with you that the LogicalSort cannot be removed
unreasonably.
My imagination was to remove the collation guarantee for logical nodes except
for LogicalSort, such as the LogicalProject and LogicalFilter and others (Sorry
maybe a little far from this topic), to make Calcite generate the initial
logical plan with "smallest collation requirement". For instance, if we use
Calcite to build a Sort-Sort-Project-Sort relation, the initial plan registered
to volcano planner will be:
{quote}LogicalSort[0] – input subset collation:[0]
LogicalSort[0] – input subset collation:[0]
LogicalProject[0] – input subset collation:[0]
LogicalSort[0] – input subset collation:[]
{quote}
If we focus on a concept like "smallest collation requirement", maybe we can
generate and register following rels instead:
{quote}LogicalSort[0] – input subset collation:[]
LogicalSort[0] – input subset collation:[]
LogicalProject[] – input subset collation:[]
LogicalSort[0] – input subset collation:[]
{quote}
That is, there will be no rel node (on LOGICAL level) that really needs sorted
input, and only the LogicalSort will guarante sorted output.
Do we lose optimizations if we were to do something like this?
was (Author: zhztheplayer):
Thanks [~julianhyde], I am with you that the LogicalSort cannot be removed
unreasonably.
My imagination was to remove the collation guarantee for logical nodes except
for LogicalSort, such as the LogicalProject and LogicalFilter and others,
(Maybe a little far from this topic), to make Calcite generate the initial
logical plan with "smallest collation requirement". For instance, if we use
calcite to build a Sort-Sort-Project-Sort relation, the initial plan registered
to volcano planner will be:
{quote}LogicalSort[0] – input subset collation:[0]
LogicalSort[0] – input subset collation:[0]
LogicalProject[0] – input subset collation:[0]
LogicalSort[0] – input subset collation:[]
{quote}
If we focus on a concept like "smallest collation requirement", maybe we can
generate and register following rels instead:
{quote}LogicalSort[0] – input subset collation:[]
LogicalSort[0] – input subset collation:[]
LogicalProject[] – input subset collation:[]
LogicalSort[0] – input subset collation:[]
{quote}
That is, there will be no rel node (on LOGICAL level) that really needs sorted
input, and only the LogicalSort will guarante sorted output.
Do we lose optimizations if we were to do something like this?
> Optimizer should remove ORDER BY in sub-query, provided it has no LIMIT or
> OFFSET
> ---------------------------------------------------------------------------------
>
> Key: CALCITE-2798
> URL: https://issues.apache.org/jira/browse/CALCITE-2798
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.18.0
> Reporter: Vladimir Sitnikov
> Assignee: Julian Hyde
> Priority: Major
>
> The following SQL performs sort twice, however inner sort can be eliminated
> {code}select * from (
> select * from "emps"
> order by "emps"."deptno"
> ) order by 1 desc{code}
> The same goes for (window calculation will sort on its own)
> {code}select row_number() over (order by "emps"."deptno") from (
> select * from "emps"
> order by "emps"."deptno" desc
> ){code}
> The same goes for SetOp (union, minus):
> {code}select * from (
> select * from "emps"
> order by "emps"."deptno"
> ) union select * from (
> select * from "emps"
> order by "emps"."deptno" desc
> ){code}
> There might be other cases like that (e.g. Aggregate, Join, Exchange,
> SortExchange)
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)