otterc commented on a change in pull request #32287:
URL: https://github.com/apache/spark/pull/32287#discussion_r625287419
##########
File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
##########
@@ -245,9 +253,21 @@ final class ShuffleBlockFetcherIterator(
         case FetchBlockInfo(blockId, size, mapIndex) => (blockId.toString, (size, mapIndex))
       }.toMap
       val remainingBlocks = new HashSet[String]() ++= infoMap.keys
+      val deferredBlocks = new ArrayBuffer[String]()
       val blockIds = req.blocks.map(_.blockId.toString)
       val address = req.address
+      @inline def enqueueDeferredFetchRequestIfNecessary(): Unit = {
+        if (remainingBlocks.isEmpty && deferredBlocks.nonEmpty) {
+          val blocks = deferredBlocks.map { blockId =>
+            val (size, mapIndex) = infoMap(blockId)
+            FetchBlockInfo(BlockId(blockId), size, mapIndex)
+          }
+          results.put(DeferFetchRequestResult(FetchRequest(address, blocks.toSeq)))
+          deferredBlocks.clear()
+        }
+      }
Review comment:
This is related to the conversation here:
https://github.com/apache/spark/pull/32287#discussion_r623425119
We were discussing simple ways to reduce the number of remote fetch requests
after an OOM. One option is that, after the OOM, we send out only the requests
that were deferred because of it and do not send any additional requests.
I am not sure how effective this is going to be, though. Since the in-flight
limit remains the same, the next call to `fetchUpToMaxBytes` sends out
non-deferred requests again, and those can cause new blocks to OOM.
Another simple option could be to modify `isRemoteBlockFetchable` so that,
after this iterator has seen an OOM, it also checks
`bytesInFlight + fetchReqQueue.front.size < freeDirectMemory`?
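
To make the second idea concrete, here is a minimal, self-contained sketch of
what such a check might look like. It is not the actual
`ShuffleBlockFetcherIterator` code: `FetchThrottle`, `QueuedRequest`,
`hasSeenOom` and the `freeDirectMemory` supplier are hypothetical names used
only for illustration, and the existing in-flight limit check is a simplified
approximation.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a queued fetch request; only its size matters here.
final case class QueuedRequest(size: Long)

// Simplified model of the throttling state. This is an illustration only,
// not the real ShuffleBlockFetcherIterator.
final class FetchThrottle(
    maxBytesInFlight: Long,
    maxReqsInFlight: Int,
    freeDirectMemory: () => Long) { // assumed supplier of free direct memory, in bytes

  var bytesInFlight: Long = 0L
  var reqsInFlight: Int = 0
  var hasSeenOom: Boolean = false // assumed flag, set once a fetch has failed with an OOM
  val fetchReqQueue: mutable.Queue[QueuedRequest] = mutable.Queue.empty[QueuedRequest]

  def isRemoteBlockFetchable: Boolean = {
    fetchReqQueue.nonEmpty && {
      val nextSize = fetchReqQueue.front.size
      // Simplified version of the usual in-flight limits.
      val withinLimits = bytesInFlight == 0 ||
        (reqsInFlight + 1 <= maxReqsInFlight &&
          bytesInFlight + nextSize <= maxBytesInFlight)
      // Extra check, applied only after this iterator has seen an OOM.
      val fitsInMemory = !hasSeenOom || bytesInFlight + nextSize < freeDirectMemory()
      withinLimits && fitsInMemory
    }
  }
}
```

One thing a real change would need to handle: in this sketch, if nothing is in
flight and the request at the front of the queue is still larger than the free
direct memory, the check never passes, so the request would have to be split
or let through to avoid a stall.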