[ https://issues.apache.org/jira/browse/DRILL-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883995#comment-15883995 ]

ASF GitHub Bot commented on DRILL-5284:
---------------------------------------

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/761#discussion_r103067805
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/ExternalSortBatch.java ---
    @@ -392,22 +448,31 @@ private void configure(DrillConfig config) {
         // Set too large and the ratio between memory and input data sizes becomes
         // small. Set too small and disk seek times dominate performance.
     
    -    spillBatchSize = config.getBytes(ExecConstants.EXTERNAL_SORT_SPILL_BATCH_SIZE);
    -    spillBatchSize = Math.max(spillBatchSize, MIN_SPILL_BATCH_SIZE);
    +    preferredSpillBatchSize = config.getBytes(ExecConstants.EXTERNAL_SORT_SPILL_BATCH_SIZE);
    +
    +    // In low memory, use no more than 1/4 of memory for each spill batch. Ensures we
    +    // can merge.
    +
    +    preferredSpillBatchSize = Math.min(preferredSpillBatchSize, memoryLimit / 4);
    --- End diff --
    
    Why restrict the spill batch size so low? This would create more runs and increase the risk of needing those intermediate merges. Besides, during a merge only a single batch at a time is read from each run, not the whole run (I believe; if we spill all the remaining batches at the end ...)
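
To make the trade-off in this comment concrete, here is a minimal sketch of the arithmetic. It is not Drill code: the class `SpillBatchSketch` and both method names are hypothetical, and `maxRunsPerMerge` assumes a simplified model in which a merge holds one input batch per run plus one output batch, all of the same size. Only the `Math.min(..., memoryLimit / 4)` cap is taken from the diff.

```java
// Hypothetical illustration; names and the merge memory model are assumptions.
final class SpillBatchSketch {

    // Mirrors the cap in the diff: a spill batch uses at most 1/4 of the memory limit.
    static long cappedSpillBatchSize(long configuredSize, long memoryLimit) {
        return Math.min(configuredSize, memoryLimit / 4);
    }

    // Assumed model: one in-memory batch per run being merged, plus one output batch.
    // Returns how many runs fit in a single merge pass under that model.
    static long maxRunsPerMerge(long memoryLimit, long batchSize) {
        return Math.max(0L, memoryLimit / batchSize - 1);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long memoryLimit = 40 * mib;
        long batch = cappedSpillBatchSize(16 * mib, memoryLimit);
        System.out.println("capped spill batch size (bytes): " + batch);
        System.out.println("runs per merge pass: " + maxRunsPerMerge(memoryLimit, batch));
    }
}
```

Under this assumed model, the cap guarantees that at least a few runs can be merged at once when memory is tight, while a larger configured size is left untouched whenever it already fits within a quarter of the limit.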



> Roll-up of final fixes for managed sort
> ---------------------------------------
>
>                 Key: DRILL-5284
>                 URL: https://issues.apache.org/jira/browse/DRILL-5284
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.10.0
>
>
> The managed external sort was introduced in DRILL-5080. Since then, 
> extensive testing has identified a number of minor fixes and improvements. 
> Given the long PR cycles, it is not practical to spend a week or two on a 
> separate PR for each fix. This ticket is a roll-up of a number of these 
> fixes. Small fixes are listed here; larger items appear as sub-tasks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
