[ 
https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated CALCITE-2624:
-----------------------------------------
    Description: 
Currently, the only rule that allows a sort to traverse a binary operator is 
the SortJoinTransposeRule. The rule was introduced mainly to push limits in the 
case of left and right outer joins (see CALCITE-831).

I assume that the main reason that we don't have more rules is that sorts with 
limits and offsets cannot be pushed safely below many types of join operators. 
However, in many cases, it is possible and beneficial for optimization purposes 
to just push the sort without the limit and offset. Since we do not know in 
advance if the join operator preserves the order we cannot remove (that is why 
I am saying copy and not transpose) the sort operator on top of the join. The 
latter is not really a problem since the SortRemoveRule can detect such cases 
and remove the sort if it is redundant.

A few concrete examples where this optimization makes sense are outlined below:
 * allow the sort to be later absorbed by an index scan and disappear from the 
plan (Sort + Tablescan => IndexScan with RelCollation);
 * allow operators that require sorted inputs to be exploited more easily 
(e.g., merge join);
 * allow the sort to be performed on a possibly smaller result (assuming that 
the physical binary operator that is going to be used preserves the order of 
left/right input and the top sort operator can be removed entirely).

I propose to add a new rule (e.g., SortCopyBelowJoinRule, 
SortJoinCopyBelowRule) which allows a sort to be copied to the left or right 
(or to both if it is rather easy to decompose the sort) of a join operator 
(excluding the limit and offset attributes) if the respective inputs are not 
already sorted. 

  was:
Currently, the only rule that allows a sort to traverse a binary operator is 
the SortJoinTransposeRule. The rule was introduced mainly to push limits in the 
case of left and right outer joins (see CALCITE-831).

I assume that the main reason that we don't have more rules is that sorts with 
limits and offsets cannot be pushed safely below many binary operators. 
However, in many cases, it is possible and beneficial for optimization purposes 
to just push the sort (without the limit and offset) below the binary operator. 
Since we do not know in advance if the binary operator preserves the sort we 
cannot remove (that is why I am saying copy and not transpose) the sort 
operator on top of the binary operator. The latter is not really a problem 
since the SortRemoveRule can detect such cases and remove the sort if it is 
redundant.

A few concrete examples where this optimization makes sense are outlined below:
 * allow the sort to be later absorbed by an index scan and disappear from the 
plan (Sort + Tablescan => IndexScan with RelCollation);
 * allow operators that require sorted inputs to be exploited more easily 
(e.g., merge join);
 * allow the sort to be performed on a possibly smaller result (assuming that 
the physical binary operator that is going to be used preserves the order of 
left/right input and the top sort operator can be removed entirely).

I propose to add a new rule (e.g., SortCopyBelowBiRelRule, 
SortBiRelCopyBelowRule, SortBiRelTransposeRule) which allows a sort to be 
copied to the left or right of a BiRel operator (excluding the limit and offset 
attributes) if the respective input is not already sorted. For my use case, it 
would suffice to apply the rule only on joins and correlates but I don't see 
why not making it a bit more general from the beginning.


> Add a rule to copy a sort below a join operator
> -----------------------------------------------
>
>                 Key: CALCITE-2624
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2624
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.17.0
>            Reporter: Stamatis Zampetakis
>            Priority: Major
>
> Currently, the only rule that allows a sort to traverse a binary operator is 
> the SortJoinTransposeRule. The rule was introduced mainly to push limits in 
> the case of left and right outer joins (see CALCITE-831).
> I assume that the main reason that we don't have more rules is that sorts 
> with limits and offsets cannot be pushed safely below many types of join 
> operators. However, in many cases, it is possible and beneficial for 
> optimization purposes to just push the sort without the limit and offset. 
> Since we do not know in advance if the join operator preserves the order we 
> cannot remove (that is why I am saying copy and not transpose) the sort 
> operator on top of the join. The latter is not really a problem since the 
> SortRemoveRule can detect such cases and remove the sort if it is redundant.
> A few concrete examples where this optimization makes sense are outlined 
> below:
>  * allow the sort to be later absorbed by an index scan and disappear from 
> the plan (Sort + Tablescan => IndexScan with RelCollation);
>  * allow operators that require sorted inputs to be exploited more easily 
> (e.g., merge join);
>  * allow the sort to be performed on a possibly smaller result (assuming that 
> the physical binary operator that is going to be used preserves the order of 
> left/right input and the top sort operator can be removed entirely).
> I propose to add a new rule (e.g., SortCopyBelowJoinRule, 
> SortJoinCopyBelowRule) which allows a sort to be copied to the left or right 
> (or to both if it is rather easy to decompose the sort) of a join operator 
> (excluding the limit and offset attributes) if the respective inputs are not 
> already sorted. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to