[jira] [Commented] (IGNITE-16430) Calcite engine. Sorted index spool with sorting can't be planned

Aleksey Plekhanov (Jira) Mon, 31 Jan 2022 05:11:06 -0800


    [ 
https://issues.apache.org/jira/browse/IGNITE-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17484663#comment-17484663
 ]


Aleksey Plekhanov commented on IGNITE-16430:
--------------------------------------------

[~tledkov-gridgain], unfortunately, it will not work. I've tried to create a 
new metadata handler that computes the rewind cost and to use this cost in the 
correlated nested loop cumulative cost calculation, but there is a problem with 
{{RelSubset}}. A subset stores only one best cost and only one rel operator 
for that cost, but one subset can contain different operators: some of them 
have a better cumulative cost, some have a better rewind cost, and the total 
best cost depends on the number of rewinds, so we can't store these best 
operators in a single field. For example, suppose one subset contains 
{{(filter + table spool)}} as one branch with cumulative cost {{cc1}} and 
rewind cost {{rc1}}, and {{(index spool + sort)}} as another branch with 
cumulative cost {{cc2}} and rewind cost {{rc2}}, such that {{cc1 < cc2}} and 
{{rc1 > rc2}}. Then the first branch should be stored as best when we are not 
in the right branch of a correlated nested loop; otherwise the second branch 
should be stored as best.
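
To make the problem concrete, here is a minimal sketch in Python. It is not Calcite or Ignite code: the {{Branch}} class, the numeric costs, and the formula {{cumulative cost + rewinds * rewind cost}} are illustrative assumptions. It shows that the cheapest branch of a subset changes with the number of rewinds, so a single "best" slot cannot hold the right answer for every parent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """Hypothetical stand-in for one alternative inside a subset."""
    name: str
    cumulative_cost: float  # cc: cost of the first execution
    rewind_cost: float      # rc: cost of each subsequent rewind


def total_cost(branch: Branch, rewinds: int) -> float:
    # Assumed model: pay the cumulative cost once, then the rewind cost per rewind.
    return branch.cumulative_cost + rewinds * branch.rewind_cost


def best_branch(branches: list[Branch], rewinds: int) -> Branch:
    # The "best" alternative is only defined for a given rewind count.
    return min(branches, key=lambda b: total_cost(b, rewinds))


# Made-up costs that satisfy cc1 < cc2 and rc1 > rc2.
filter_table_spool = Branch("filter + table spool", cumulative_cost=10.0, rewind_cost=8.0)
index_spool_sort = Branch("index spool + sort", cumulative_cost=30.0, rewind_cost=1.0)
subset = [filter_table_spool, index_spool_sort]

for rewinds in (0, 100):
    print(rewinds, best_branch(subset, rewinds).name)
```

With no rewinds the first branch is cheaper; with many rewinds the second one is, which is exactly the case one stored best operator cannot represent.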

> Calcite engine. Sorted index spool with sorting can't be planned
> ----------------------------------------------------------------
>
>                 Key: IGNITE-16430
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16430
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Aleksey Plekhanov
>            Priority: Major
>              Labels: calcite2-required, calcite3-required
>
> Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that 
> creates a sorted spool even if the input collation is empty. In this case, 
> the collation is created from the index condition and a new sort node is 
> inserted before the spool. But such a plan can never be chosen as the best 
> plan, because when we calculate the cumulative cost for the correlated nested 
> loop join, we multiply the left-side row count by the right-side cumulative 
> cost without taking the rewind cost into account. Currently, the cumulative 
> cost for filter + spool is CPU: {{n + n}}, memory: {{0 + n}}; for sorted 
> spool + sort it is CPU: {{log n + n*log n}}, memory: {{n + n}}. So the cost 
> of filter + spool will always be better than the cost of sorted spool + sort, 
> and sorted spool + sort can never be chosen. However, for sorted spool + sort 
> the rewind CPU cost is only {{log n}}, since sorting is required only once, 
> while the rewind CPU cost of filter + spool is {{n + n}}. So, starting from 
> some iteration count, the cost {{iterations * rewind cost + cumulative cost}} 
> will be better than {{iterations * cumulative cost}}, and sorted spool + sort 
> will be chosen in this case.
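
The CPU figures from the quoted description can be plugged into both cost formulas to see where the plans cross over. The sketch below is only an illustration: the function names, the choice of base-2 logarithm, and {{n = 1000}} are assumptions, and memory cost is ignored.

```python
import math


def filter_spool_costs(n: int) -> tuple[float, float]:
    # (cumulative CPU, rewind CPU) for filter + spool: n + n in both cases.
    return float(n + n), float(n + n)


def sorted_spool_costs(n: int) -> tuple[float, float]:
    # (cumulative CPU, rewind CPU) for sorted spool + sort.
    # Assumption: log is taken base 2; the description does not specify the base.
    log_n = math.log2(n)
    return log_n + n * log_n, log_n


def current_model(costs: tuple[float, float], iterations: int) -> float:
    # Current behaviour: the cumulative cost is multiplied by the iteration count.
    cumulative, _ = costs
    return iterations * cumulative


def rewind_aware_model(costs: tuple[float, float], iterations: int) -> float:
    # Formula from the description: iterations * rewind cost + cumulative cost.
    cumulative, rewind = costs
    return iterations * rewind + cumulative


n = 1000
for iterations in (1, 10, 100):
    print(
        iterations,
        current_model(filter_spool_costs(n), iterations) < current_model(sorted_spool_costs(n), iterations),
        rewind_aware_model(filter_spool_costs(n), iterations) < rewind_aware_model(sorted_spool_costs(n), iterations),
    )
```

Under the current model filter + spool wins for every iteration count, while under the rewind-aware formula sorted spool + sort becomes cheaper once the iteration count is large enough.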



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
