Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21219#discussion_r185788654
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -407,6 +407,25 @@ final class ShuffleBlockFetcherIterator(
logDebug("Number of requests in flight " + reqsInFlight)
}
+ if (buf.size == 0) {
+ // We will never legitimately receive a zero-size block. All blocks with zero records
+ // have zero size and all zero-size blocks have no records (and hence should never
+ // have been requested in the first place). This statement relies on behaviors of the
+ // shuffle writers, which are guaranteed by the following test cases:
+ //
+ // - BypassMergeSortShuffleWriterSuite: "write with some empty partitions"
+ // - UnsafeShuffleWriterSuite: "writeEmptyIterator"
+ // - DiskBlockObjectWriterSuite: "commit() and close() without ever opening or writing"
+ //
+ // There is not an explicit test for SortShuffleWriter but the underlying APIs that
+ // uses are shared by the UnsafeShuffleWriter (both writers use DiskBlockObjectWriter
+ // which returns a zero-size from commitAndGet() in case the no records were written
--- End diff --
Seems like a typo: `the no` should probably just be `no`, btw.
---