Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/767#discussion_r104275259
  
    --- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/ExternalSortBatch.java
 ---
    @@ -1333,8 +1339,43 @@ private void spillFromMemory() {
         mergeAndSpill(bufferedBatches, spillCount);
       }
     
    +  private void mergeRuns(int targetCount) {
    +
    +    // Determine the number of runs to merge. The count should be the
    +    // target count. However, to prevent possible memory overrun, we
    +    // double-check with actual spill batch size and only spill as much
    +    // as fits in the merge memory pool.
    +
    +    int mergeCount = 0;
    +    long mergeSize = 0;
    +    for (SpilledRun batch : spilledRuns) {
    +      long batchSize = batch.getBatchSize();
    +      if (mergeSize + batchSize > mergeMemoryPool) {
    +        break;
    +      }
    +      mergeSize += batchSize;
    +      mergeCount++;
    +      if (mergeCount == targetCount) {
    +        break;
    +      }
    +    }
    +
    +    // Must always spill at least 2, even if this creates an over-size
    +    // spill file. But, if this is a final consolidation, we may have only
    +    // a single batch.
    +
    +    mergeCount = Math.max(mergeCount, 2);
    +    mergeCount = Math.min(mergeCount, spilledRuns.size());
    +
    +    // Do the actual spill.
    +
    +    mergeAndSpill(spilledRuns, mergeCount);
    --- End diff --
    
    The logic is a bit more complex. We use the size of the largest row as our 
row size estimate. But, when we spill, we also consider the actual batch sizes. 
Earlier versions of the code were more sensitive to the row size estimate; more 
recent revisions have evolved to favor actual sizes in many cases.
    
    For "first generation" batches, we use the actual "payload" amount of data 
in each batch, sum that amount, and stop when we have enough data to fill a 
spill file. Thus, first generation spills are immune to the problem you 
describe.
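    
    A minimal sketch of that first-generation selection, assuming a fixed 
spill-file target and per-batch payload sizes (all names here are illustrative, 
not the actual `ExternalSortBatch` members):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: accumulate the actual payload bytes of in-memory
// batches and stop once a spill file's worth of data has been gathered.
// SPILL_FILE_SIZE and countBatchesToSpill are illustrative names, not
// the real ExternalSortBatch fields.
public class FirstGenSpillSketch {

  // Example spill-file target: 256 MB (an assumption for illustration).
  static final long SPILL_FILE_SIZE = 256L * 1024 * 1024;

  // Returns how many leading batches to write into one spill file.
  static int countBatchesToSpill(List<Long> batchPayloadSizes) {
    long accumulated = 0;
    int count = 0;
    for (long payload : batchPayloadSizes) {
      accumulated += payload;
      count++;
      if (accumulated >= SPILL_FILE_SIZE) {
        break;    // enough data to fill the spill file
      }
    }
    return count;
  }

  public static void main(String[] args) {
    long mb100 = 100L * 1024 * 1024;
    List<Long> sizes = Arrays.asList(mb100, mb100, mb100, mb100);
    // The third batch crosses the 256 MB target, so three batches spill.
    System.out.println(countBatchesToSpill(sizes)); // prints 3
  }
}
```

Because the count is driven by summed actual payload rather than the row size 
estimate, an off estimate cannot cause a first-generation spill to overrun the 
target.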
    
    For merge/spills, we guess a target number of batches based on the target 
spill batch size. If the above logic works, then we know (roughly) the size of 
the spilled batches -- because we created them. As a sanity check, new code 
above sums the actual batch sizes in the files to merge to ensure that they fit 
in memory, choosing a smaller-than-requested number of batches if the merge 
batches turn out to be a bit larger than expected.

