otterc commented on a change in pull request #32287:
URL: https://github.com/apache/spark/pull/32287#discussion_r625919892
##########
File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
##########
@@ -245,9 +253,21 @@ final class ShuffleBlockFetcherIterator(
       case FetchBlockInfo(blockId, size, mapIndex) => (blockId.toString, (size, mapIndex))
     }.toMap
     val remainingBlocks = new HashSet[String]() ++= infoMap.keys
+    val deferredBlocks = new ArrayBuffer[String]()
     val blockIds = req.blocks.map(_.blockId.toString)
     val address = req.address
+    @inline def enqueueDeferredFetchRequestIfNecessary(): Unit = {
+      if (remainingBlocks.isEmpty && deferredBlocks.nonEmpty) {
+        val blocks = deferredBlocks.map { blockId =>
+          val (size, mapIndex) = infoMap(blockId)
+          FetchBlockInfo(BlockId(blockId), size, mapIndex)
+        }
+        results.put(DeferFetchRequestResult(FetchRequest(address, blocks.toSeq)))
+        deferredBlocks.clear()
+      }
+    }
Review comment:
> I have changed the unset condition to freeDirectMemory > maxReqSizeShuffleToMem (200M by default), which I think is already very strict. So, it should avoid the issue you mentioned in #32287 (comment).

I don't think it will avoid the issue. This adds more time before the next set of requests is sent. However, the next set of requests (including the deferred ones) will still be sent at the same frequency, so some of them would again hit OOMs.
> Do you mean check `bytesInFlight + fetchReqQueue.front.size < freeDirectMemory` for all the cases or only when `isNettyOOMOnShuffle=true`?

I meant that once this OOM is encountered, from then on the iterator also checks against freeDirectMemory. If a request's size > `maxReqSizeShuffleToMem`, then we can skip the check for it, as that request is fetched to disk.
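A minimal, self-contained sketch of the check described above, assuming hypothetical names (`nettyOOMSeen`, `freeDirectMemory()`, `canSend()`) rather than the iterator's actual fields:

```scala
// Rough sketch only: nettyOOMSeen, freeDirectMemory() and canSend() are
// hypothetical stand-ins, not members of ShuffleBlockFetcherIterator.
object FetchGatingSketch {
  final case class FetchRequest(size: Long)

  var bytesInFlight: Long = 0L
  var nettyOOMSeen: Boolean = false                      // flipped once a Netty OOM is hit
  val maxReqSizeShuffleToMem: Long = 200L * 1024 * 1024  // 200M default

  // Placeholder for querying the allocator for currently free direct memory.
  def freeDirectMemory(): Long = Long.MaxValue

  def canSend(req: FetchRequest): Boolean = {
    if (!nettyOOMSeen) {
      true                                               // normal path: no extra check
    } else if (req.size > maxReqSizeShuffleToMem) {
      true                                               // fetched to disk, skip the check
    } else {
      // Only send while in-flight bytes plus this request fit in free direct memory.
      bytesInFlight + req.size < freeDirectMemory()
    }
  }
}
```

With this shape, requests larger than `maxReqSizeShuffleToMem` keep flowing unconditionally, while smaller ones are throttled against free direct memory only after the first OOM has been observed.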