Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/7948#discussion_r36272873
--- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java ---
@@ -191,24 +191,28 @@ public void spill() throws IOException {
spillWriters.size(),
spillWriters.size() > 1 ? " times" : " time");
- final UnsafeSorterSpillWriter spillWriter =
- new UnsafeSorterSpillWriter(blockManager, fileBufferSizeBytes, writeMetrics,
- inMemSorter.numRecords());
- spillWriters.add(spillWriter);
- final UnsafeSorterIterator sortedRecords = inMemSorter.getSortedIterator();
- while (sortedRecords.hasNext()) {
- sortedRecords.loadNext();
- final Object baseObject = sortedRecords.getBaseObject();
- final long baseOffset = sortedRecords.getBaseOffset();
- final int recordLength = sortedRecords.getRecordLength();
- spillWriter.write(baseObject, baseOffset, recordLength, sortedRecords.getKeyPrefix());
+ // We only write out contents of the inMemSorter if it is not empty.
+ if (inMemSorter.numRecords() > 0) {
+ final UnsafeSorterSpillWriter spillWriter =
+ new UnsafeSorterSpillWriter(blockManager, fileBufferSizeBytes, writeMetrics,
+ inMemSorter.numRecords());
+ spillWriters.add(spillWriter);
+ final UnsafeSorterIterator sortedRecords = inMemSorter.getSortedIterator();
+ while (sortedRecords.hasNext()) {
+ sortedRecords.loadNext();
+ final Object baseObject = sortedRecords.getBaseObject();
+ final long baseOffset = sortedRecords.getBaseOffset();
+ final int recordLength = sortedRecords.getRecordLength();
+ spillWriter.write(baseObject, baseOffset, recordLength, sortedRecords.getKeyPrefix());
+ }
+ spillWriter.close();
+ final long spillSize = freeMemory();
--- End diff --
Actually, one comment: should this `freeMemory()` call be outside of the `if`
condition? I'm not sure what happens if you call `initializeForWriting()` in a
case where you haven't already called `freeMemory()`.
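
To make the concern concrete, here is a minimal toy model, not Spark's actual code. `SpillSketch`, its page list, and the `freeOutsideIf` flag are all hypothetical, and the sketch assumes that `initializeForWriting()` allocates a fresh page without releasing pages that are already held. Whether the real method behaves that way is exactly the open question.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for the sorter. None of this is Spark's real API. */
public class SpillSketch {
    private final List<int[]> allocatedPages = new ArrayList<>();
    private int numRecords = 0;
    private int spillCount = 0;

    /** ASSUMPTION: allocates a new page and does not release old ones. */
    public void initializeForWriting() {
        allocatedPages.add(new int[16]);
        numRecords = 0;
    }

    public void insertRecord() {
        numRecords++;
    }

    /** Releases every held page and returns how many were released. */
    public int freeMemory() {
        int freed = allocatedPages.size();
        allocatedPages.clear();
        return freed;
    }

    /** Hypothetical spill(); the flag selects where freeMemory() is called. */
    public void spill(boolean freeOutsideIf) {
        if (numRecords > 0) {
            spillCount++;              // stands in for writing the spill file
            if (!freeOutsideIf) {
                freeMemory();          // placement shown in the diff
            }
        }
        if (freeOutsideIf) {
            freeMemory();              // placement the comment asks about
        }
        initializeForWriting();
    }

    public int pagesHeld() {
        return allocatedPages.size();
    }

    public int spillCount() {
        return spillCount;
    }

    public static void main(String[] args) {
        SpillSketch inside = new SpillSketch();
        inside.initializeForWriting();
        inside.spill(false);           // sorter is empty, so nothing is freed
        System.out.println("freeMemory() inside if:  pages held = " + inside.pagesHeld());

        SpillSketch outside = new SpillSketch();
        outside.initializeForWriting();
        outside.spill(true);           // pages are freed even when empty
        System.out.println("freeMemory() outside if: pages held = " + outside.pagesHeld());
    }
}
```

Under that assumption, spilling an empty sorter with the call inside the `if` leaves the old page held alongside the new one, while the outside placement keeps the count at one.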