[
https://issues.apache.org/jira/browse/DRILL-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297509#comment-16297509
]
ASF GitHub Bot commented on DRILL-6030:
---------------------------------------
Github user vrozov commented on the issue:
https://github.com/apache/drill/pull/1075
The scenario in which all batches can be merged in memory is covered by the `if
(canUseMemoryMerge())` check in `SortImpl.java:399`. The affected code path
applies only to cases where a merge between spilled and in-memory batches is
necessary. Note that this is a short-term fix to improve managed sort
performance. In the long run, it is necessary to be able to merge all
batches in memory (using SV4) without spilling and to merge the result with the
spilled data.
> Managed sort should minimize number of batches in a k-way merge
> ---------------------------------------------------------------
>
> Key: DRILL-6030
> URL: https://issues.apache.org/jira/browse/DRILL-6030
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Vlad Rozov
> Assignee: Vlad Rozov
>
> The time complexity of the algorithm is O(n*k*log(k)), where k is the number of
> batches to merge and n is the number of records in each batch (assuming
> equal-size batches). As n*k is the total number of records to merge and it can
> be quite large, minimizing k should give better results.
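
To make the O(n*k*log(k)) estimate above concrete, here is a minimal sketch of a heap-based k-way merge. It is not Drill's `SortImpl` code: the class `KWayMergeSketch`, its `Cursor` helper, and the use of plain `List<Integer>` runs in place of record batches are assumptions made purely for illustration. Each of the n*k output records costs one heap removal and at most one heap insertion, both O(log(k)), which is where the log(k) factor comes from.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; not taken from Drill's managed sort.
class KWayMergeSketch {

    // One cursor per sorted input run (a "batch" in the issue's terms).
    private static final class Cursor implements Comparable<Cursor> {
        int head;                     // smallest not-yet-emitted value of this run
        final Iterator<Integer> rest; // remaining values of this run

        Cursor(Iterator<Integer> it) {
            this.head = it.next();
            this.rest = it;
        }

        @Override
        public int compareTo(Cursor other) {
            return Integer.compare(head, other.head);
        }
    }

    // Merges k individually sorted runs into one sorted list.
    static List<Integer> merge(List<List<Integer>> runs) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        int total = 0;
        for (List<Integer> run : runs) {
            total += run.size();
            if (!run.isEmpty()) {
                heap.add(new Cursor(run.iterator()));
            }
        }
        List<Integer> out = new ArrayList<>(total);
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();      // O(log k): take the run with the smallest head
            out.add(c.head);
            if (c.rest.hasNext()) {
                c.head = c.rest.next();  // advance while the cursor is outside the heap
                heap.add(c);             // O(log k): re-insert with its new head
            }
        }
        return out;
    }
}
```

For example, `KWayMergeSketch.merge(Arrays.asList(Arrays.asList(1, 4, 9), Arrays.asList(2, 3, 10), Arrays.asList(0, 5, 6)))` returns `[0, 1, 2, 3, 4, 5, 6, 9, 10]`. With the total record count fixed, the heap size k is the only variable left in the per-record cost, which is why the issue focuses on reducing the number of batches.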
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)