Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/767#discussion_r104275259
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/ExternalSortBatch.java
---
@@ -1333,8 +1339,43 @@ private void spillFromMemory() {
mergeAndSpill(bufferedBatches, spillCount);
}
+ private void mergeRuns(int targetCount) {
+
+ // Determine the number of runs to merge. The count should be the
+ // target count. However, to prevent possible memory overrun, we
+ // double-check with actual spill batch size and only spill as much
+ // as fits in the merge memory pool.
+
+ int mergeCount = 0;
+ long mergeSize = 0;
+ for (SpilledRun batch : spilledRuns) {
+ long batchSize = batch.getBatchSize();
+ if (mergeSize + batchSize > mergeMemoryPool) {
+ break;
+ }
+ mergeSize += batchSize;
+ mergeCount++;
+ if (mergeCount == targetCount) {
+ break;
+ }
+ }
+
+ // Must always spill at least 2, even if this creates an over-size
+ // spill file. But, if this is a final consolidation, we may have only
+ // a single batch.
+
+ mergeCount = Math.max(mergeCount, 2);
+ mergeCount = Math.min(mergeCount, spilledRuns.size());
+
+ // Do the actual spill.
+
+ mergeAndSpill(spilledRuns, mergeCount);
--- End diff --
The logic is a bit more complex. We use the size of the largest row as our
row size estimate. But, when we spill, we also consider the actual batch sizes.
Earlier versions of the code were more sensitive to the row size estimate, more
recent revisions have evolved to favor actual sizes in many cases.
For "first generation" batches, we use the actual "payload" amount of data
in each batch, sum that amount, and stop when we have enough data to fill a
spill file. Thus, first generation spills are immune to the problem you
describe.
For merge/spills, we guess a target number of batches based on the target
spill batch size. If the above logic works, then we know (roughly) the size of
the spilled batches -- because we created them. As a sanity check, new code
above sums the actual batch sizes in the files to merge to ensure that they fit
in memory, choosing a smaller-than-requested number of batches if the merge
batches turn out to be a bit larger than expected.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---