otterc commented on a change in pull request #32287:
URL: https://github.com/apache/spark/pull/32287#discussion_r625287419



##########
File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
##########
@@ -245,9 +253,21 @@ final class ShuffleBlockFetcherIterator(
       case FetchBlockInfo(blockId, size, mapIndex) => (blockId.toString, (size, mapIndex))
     }.toMap
     val remainingBlocks = new HashSet[String]() ++= infoMap.keys
+    val deferredBlocks = new ArrayBuffer[String]()
     val blockIds = req.blocks.map(_.blockId.toString)
     val address = req.address
 
+    @inline def enqueueDeferredFetchRequestIfNecessary(): Unit = {
+      if (remainingBlocks.isEmpty && deferredBlocks.nonEmpty) {
+        val blocks = deferredBlocks.map { blockId =>
+          val (size, mapIndex) = infoMap(blockId)
+          FetchBlockInfo(BlockId(blockId), size, mapIndex)
+        }
+        results.put(DeferFetchRequestResult(FetchRequest(address, blocks.toSeq)))
+        deferredBlocks.clear()
+      }
+    }

Review comment:
   This is related to the conversation here:
   https://github.com/apache/spark/pull/32287#discussion_r623425119
   We were discussing simple ways to reduce the number of remote fetch requests 
after the OOM. One option is that, after the OOM, we send out only the requests 
that were deferred due to the OOM and do not send any additional requests. 
   I am not sure how effective this is going to be, though. Since the in-flight 
limit remains the same, the next call to `fetchUpToMaxBytes`, which sends out 
non-deferred requests, can cause new blocks to OOM.
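
To make the first option concrete, here is a minimal sketch of it on a toy model, not the real iterator. `FetchRequest` is reduced to an address and a size, and `DeferredOnlyAfterOom`, `oomSeen`, `deferredFetchRequests` and `fetchRequests` are hypothetical names chosen for illustration. The only limit modelled is a byte limit, which is an assumption.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for a fetch request; not the Spark class.
final case class FetchRequest(address: String, size: Long)

// Toy scheduler: on the call right after an OOM, only deferred requests go out.
final class DeferredOnlyAfterOom(maxBytesInFlight: Long) {
  val deferredFetchRequests = mutable.Queue.empty[FetchRequest]
  val fetchRequests = mutable.Queue.empty[FetchRequest]
  var bytesInFlight: Long = 0L
  var oomSeen: Boolean = false // assumed to be set when a fetch fails with an OOM

  private def fits(queue: mutable.Queue[FetchRequest]): Boolean =
    queue.nonEmpty &&
      (bytesInFlight == 0 || bytesInFlight + queue.front.size <= maxBytesInFlight)

  def fetchUpToMaxBytes(): Seq[FetchRequest] = {
    val sent = mutable.ArrayBuffer.empty[FetchRequest]
    def drain(queue: mutable.Queue[FetchRequest]): Unit =
      while (fits(queue)) {
        val req = queue.dequeue()
        bytesInFlight += req.size
        sent += req
      }
    drain(deferredFetchRequests)
    // Skip the regular queue only on the call that follows an OOM.
    if (!oomSeen) drain(fetchRequests)
    oomSeen = false
    sent.toSeq
  }
}
```

In this sketch the restriction lasts for a single call. The following call drains the regular queue under the same byte limit, which is the weakness described above.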
   
   Another simple way could be to modify `isRemoteBlockFetchable` so that, 
after this iterator has seen an OOM, it also checks that `bytesInFlight + 
fetchReqQueue.front.size` is less than the free direct memory.
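
A rough sketch of that second option, again on a toy model rather than the actual class. `FetchThrottle`, `sawOom` and the `freeDirectMemory` probe are hypothetical. How the free direct memory would really be measured is left open here, and the existing check is reduced to a byte limit.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for a fetch request; not the Spark class.
final case class FetchRequest(address: String, size: Long)

final class FetchThrottle(
    maxBytesInFlight: Long,
    freeDirectMemory: () => Long) { // assumed probe returning free direct memory in bytes
  val fetchReqQueue = mutable.Queue.empty[FetchRequest]
  var bytesInFlight: Long = 0L
  var sawOom: Boolean = false // assumed to be set once this iterator has seen an OOM

  def isRemoteBlockFetchable: Boolean =
    fetchReqQueue.nonEmpty && {
      val wouldBeInFlight = bytesInFlight + fetchReqQueue.front.size
      // Simplified version of the usual in-flight byte limit.
      val withinLimit = bytesInFlight == 0 || wouldBeInFlight <= maxBytesInFlight
      // Extra check, applied only after an OOM has been observed.
      val withinMemory = !sawOom || wouldBeInFlight < freeDirectMemory()
      withinLimit && withinMemory
    }
}
```

One thing the sketch leaves unresolved: if a single request is larger than the free direct memory, the check never passes, so a real change would need a fallback for that case.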






