[ 
https://issues.apache.org/jira/browse/CALCITE-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310420#comment-17310420
 ] 

Vladimir Sitnikov commented on CALCITE-4558:
--------------------------------------------

A pointer to a row has a fixed size regardless of the number of columns. In 
other words, a pointer swap has a fixed per-row cost that does not depend on 
the row width or field count.

A good sorting algorithm should not access unused fields, so I see no reason 
why you keep saying "proportional to the number of fields".
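
To illustrate the point, here is a minimal sketch (not Calcite code; the class 
and method names are hypothetical) that sorts rows represented as Object[] by 
a single key column. The comparator reads only the key field, and 
Arrays.sort moves row references, so the work per comparison and per swap is 
the same for a 2-field row and a 50-field row.

```java
import java.util.Arrays;

// Hypothetical illustration, not taken from Calcite.
class RowSortSketch {
  /** Sorts rows in place by one key column; only row references are moved. */
  @SuppressWarnings("unchecked")
  static void sortByColumn(Object[][] rows, int keyIndex) {
    // The comparator touches a[keyIndex] and b[keyIndex] only;
    // the remaining fields of each row are never read.
    Arrays.sort(rows,
        (a, b) -> ((Comparable<Object>) a[keyIndex]).compareTo(b[keyIndex]));
  }

  /** Builds a row with the given key in column 0 and filler in the rest. */
  static Object[] makeRow(int key, int fieldCount) {
    Object[] row = new Object[fieldCount];
    row[0] = key;
    for (int i = 1; i < fieldCount; i++) {
      row[i] = "filler" + i;
    }
    return row;
  }

  public static void main(String[] args) {
    Object[][] narrow = {makeRow(3, 2), makeRow(1, 2), makeRow(2, 2)};
    Object[][] wide = {makeRow(3, 50), makeRow(1, 50), makeRow(2, 50)};

    Object[] smallestWide = wide[1];
    sortByColumn(narrow, 0);
    sortByColumn(wide, 0);

    // The same row object ends up first: it was moved by reference, not copied.
    System.out.println(wide[0] == smallestWide);
    System.out.println(Arrays.toString(narrow[0]));
  }
}
```

The identity check in main is the relevant part: the sorted array holds the 
very same row objects, so no field-by-field copy took place.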

The cost of an on-disk sort is different, since it might require multiple 
passes, with the data stored and reloaded in between.
The key factors for costing an on-disk sort are disk performance and the 
amount of memory the algorithm can use for the intermediate steps.
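
As a rough sketch of why memory matters there, the textbook external merge 
sort first produces sorted runs that fit in memory and then merges them a 
fixed number at a time. The snippet below counts the merge passes under that 
model. It is an assumption-laden illustration, not Calcite code: the class, 
the method, and the parameters rowsPerRun and fanIn are hypothetical.

```java
// Hypothetical illustration of a textbook external merge sort, not Calcite code.
class ExternalSortPassesSketch {
  /**
   * Returns the number of merge passes needed after run generation.
   *
   * @param rowCount   total number of rows to sort
   * @param rowsPerRun rows that fit in memory at once (size of one sorted run)
   * @param fanIn      number of runs merged together in one merge step
   */
  static int mergePasses(long rowCount, long rowsPerRun, int fanIn) {
    if (rowCount < 0 || rowsPerRun <= 0 || fanIn < 2) {
      throw new IllegalArgumentException("invalid sort parameters");
    }
    // Number of initial sorted runs, rounded up.
    long runs = (rowCount + rowsPerRun - 1) / rowsPerRun;
    int passes = 0;
    // Each pass reads and writes all the data once and
    // reduces the number of runs by a factor of fanIn.
    while (runs > 1) {
      runs = (runs + fanIn - 1) / fanIn;
      passes++;
    }
    return passes;
  }

  public static void main(String[] args) {
    // Everything fits in memory: no merge pass is needed.
    System.out.println(mergePasses(100, 100, 2));
    // Less memory per merge step means more passes over the data.
    System.out.println(mergePasses(1_000_000, 1_000, 10));
    System.out.println(mergePasses(1_000_000, 1_000, 2));
  }
}
```

Each extra pass is a full store-load cycle of the data, which is why such a 
cost depends on available memory and disk throughput rather than on CPU alone.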

I assume that the systems that perform on-disk sort would override the cost 
function. If you want to support on-disk costing in Sort, please file a 
separate JIRA for that.

For now, I want to fix the Sort cost so that it represents an in-memory sort.

> Sort CPU cost should not incur per-field copy cost for alignment with filter 
> and project
> ----------------------------------------------------------------------------------------
>
>                 Key: CALCITE-4558
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4558
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.26.0
>            Reporter: Vladimir Sitnikov
>            Priority: Major
>
> Typical Java implementations of the sort do not copy rows (they copy 
> references only), so it makes little sense to have "row width" as the key 
> driver of the sort costing.
> The CPU cost for filter does not include "row copy" cost.
> Even though the implementations might be different, in-core costs should be 
> aligned.
> For instance, the current EnumerableLimitSort and EnumerableSort have a 
> bytesPerRow multiplier; however, the implementation does not copy rows 
> field-by-field.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
