Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/761#discussion_r103334294
  
    --- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/ExternalSortBatch.java
 ---
    @@ -948,50 +1027,50 @@ private void updateMemoryEstimates(long memoryDelta, 
RecordBatchSizer sizer) {
         // spill batches of either 64K records, or as many records as fit into 
the
         // amount of memory dedicated to each spill batch, whichever is less.
     
    -    spillBatchRowCount = (int) Math.max(1, spillBatchSize / 
estimatedRowWidth);
    +    spillBatchRowCount = (int) Math.max(1, preferredSpillBatchSize / 
estimatedRowWidth / 2);
         spillBatchRowCount = Math.min(spillBatchRowCount, Character.MAX_VALUE);
     
    +    // Compute the actual spill batch size which may be larger or smaller
    +    // than the preferred size depending on the row width. Double the 
estimated
    +    // memory needs to allow for power-of-two rounding.
    +
    +    targetSpillBatchSize = spillBatchRowCount * estimatedRowWidth * 2;
    +
         // Determine the number of records per batch per merge step. The goal 
is to
         // merge batches of either 64K records, or as many records as fit into 
the
         // amount of memory dedicated to each merge batch, whichever is less.
     
    -    targetMergeBatchSize = preferredMergeBatchSize;
    -    mergeBatchRowCount = (int) Math.max(1, targetMergeBatchSize / 
estimatedRowWidth);
    +    mergeBatchRowCount = (int) Math.max(1, preferredMergeBatchSize / 
estimatedRowWidth / 2);
         mergeBatchRowCount = Math.min(mergeBatchRowCount, Character.MAX_VALUE);
    +    targetMergeBatchSize = mergeBatchRowCount * estimatedRowWidth * 2;
     
         // Determine the minimum memory needed for spilling. Spilling is done 
just
         // before accepting a batch, so we must spill if we don't have room 
for a
         // (worst case) input batch. To spill, we need room for the output 
batch created
         // by merging the batches already in memory. Double this to allow for 
power-of-two
         // memory allocations.
     
    -    spillPoint = estimatedInputBatchSize + 2 * spillBatchSize;
    +    long spillPoint = estimatedInputBatchSize + 2 * targetSpillBatchSize;
     
         // The merge memory pool assumes we can spill all input batches. To 
make
         // progress, we must have at least two merge batches (same size as an 
output
         // batch) and one output batch. Again, double to allow for power-of-two
         // allocation and add one for a margin of error.
     
    -    int minMergeBatches = 2 * 3 + 1;
    -    long minMergeMemory = minMergeBatches * targetMergeBatchSize;
    +    long minMergeMemory = Math.round((2 * targetSpillBatchSize + 
targetMergeBatchSize) * 1.05);
     
         // If we are in a low-memory condition, then we might not have room 
for the
         // default output batch size. In that case, pick a smaller size.
     
    -    long minMemory = Math.max(spillPoint, minMergeMemory);
    -    if (minMemory > memoryLimit) {
    -
    -      // Figure out the minimum output batch size based on memory, but 
can't be
    -      // any smaller than the defined minimum.
    -
    -      targetMergeBatchSize = Math.max(MIN_MERGED_BATCH_SIZE, memoryLimit / 
minMergeBatches);
    +    if (minMergeMemory > memoryLimit) {
     
    -      // Regardless of anything else, the batch must hold at least one
    -      // complete row.
    +      // Figure out the minimum output batch size based on memory,
    +      // must hold at least one complete row.
     
    -      targetMergeBatchSize = Math.max(estimatedRowWidth, 
targetMergeBatchSize);
    -      spillPoint = estimatedInputBatchSize + 2 * spillBatchSize;
    -      minMergeMemory = minMergeBatches * targetMergeBatchSize;
    +      long mergeAllowance = Math.round((memoryLimit - 2 * 
targetSpillBatchSize) * 0.95);
    +      targetMergeBatchSize = Math.max(estimatedRowWidth, mergeAllowance / 
2);
    +      mergeBatchRowCount = (int) (targetMergeBatchSize / estimatedRowWidth 
/ 2);
    --- End diff --
    
    Good catch! Fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

Reply via email to