[ 
https://issues.apache.org/jira/browse/DRILL-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297509#comment-16297509
 ] 

ASF GitHub Bot commented on DRILL-6030:
---------------------------------------

Github user vrozov commented on the issue:

    https://github.com/apache/drill/pull/1075
  
    The scenario in which all batches can be merged in memory is covered by the 
`if (canUseMemoryMerge())` check at `SortImpl.java:399`. The affected code path 
applies only when a merge between spilled and in-memory batches is 
necessary. Note that this is a short-term fix to improve managed sort 
performance. In the long run, it is necessary to be able to merge all 
in-memory batches (using SV4) without spilling and then merge the result with 
the spilled data.


> Managed sort should minimize number of batches in a k-way merge
> ---------------------------------------------------------------
>
>                 Key: DRILL-6030
>                 URL: https://issues.apache.org/jira/browse/DRILL-6030
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Vlad Rozov
>            Assignee: Vlad Rozov
>
> The time complexity of the algorithm is O(n*k*log(k)), where k is the number 
> of batches to merge and n is the number of records in each batch (assuming 
> equal-size batches). As n*k is the total number of records to merge and it 
> can be quite large, minimizing k should give better results.
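
To illustrate where the log(k) factor in the description comes from, here is a minimal, generic sketch of a heap-based k-way merge in Java. It is not Drill's implementation: the class `KWayMergeSketch`, the `Cursor` helper, and the use of `int[]` arrays as "batches" are all assumptions made for illustration. Each of the n*k records passes through a heap of at most k cursors once, so each record costs O(log(k)) comparisons.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; names and types are not taken from Drill.
public class KWayMergeSketch {

    /** Cursor into one sorted batch: the batch index and the next unread position. */
    private static final class Cursor {
        final int batch;
        int pos;
        Cursor(int batch) { this.batch = batch; }
    }

    /** Merges k individually sorted batches using a binary heap of at most k cursors. */
    public static List<Integer> merge(List<int[]> batches) {
        // Order cursors by the record each one currently points at.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
            Math.max(1, batches.size()),
            (a, b) -> Integer.compare(batches.get(a.batch)[a.pos],
                                      batches.get(b.batch)[b.pos]));
        for (int i = 0; i < batches.size(); i++) {
            if (batches.get(i).length > 0) {
                heap.add(new Cursor(i));
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            // poll() and add() each cost O(log k); this runs once per record.
            Cursor c = heap.poll();
            out.add(batches.get(c.batch)[c.pos]);
            c.pos++;
            if (c.pos < batches.get(c.batch).length) {
                heap.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> batches = Arrays.asList(
            new int[] {1, 4, 7},
            new int[] {2, 5, 8},
            new int[] {3, 6, 9});
        System.out.println(merge(batches)); // prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

With the total record count n*k fixed, the only term that can shrink is log(k), which is why merging fewer, larger batches reduces the comparison count.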



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
