Ngone51 commented on a change in pull request #32287:
URL: https://github.com/apache/spark/pull/32287#discussion_r623686357
##########
File path:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
##########
@@ -708,6 +785,15 @@ final class ShuffleBlockFetcherIterator(
   }
   private def fetchUpToMaxBytes(): Unit = {
+    if (isNettyOOMOnShuffle.get()) {
+      if (reqsInFlight > 0) {
+        // Return immediately if Netty is still OOMed and there're ongoing fetch requests
+        return
+      } else {
+        ShuffleBlockFetcherIterator.resetNettyOOMFlagIfPossible(0)
+      }
+    }
+
Review comment:
I have tightened the reset condition: the flag is now reset only when Netty's
free memory is larger than `maxReqSizeShuffleToMem` (default 200M), which is
stricter than comparing against `averageBlockSize`. I think this should
mitigate the issue you mentioned here. WDYT?
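
To make the condition concrete, here is a minimal, self-contained sketch of the reset check described above. It is not the code in the PR: `NettyOOMFlagSketch` and the `freeNettyMemory` parameter are hypothetical stand-ins (the latter for however free Netty memory is actually measured), and the 200 MiB constant just mirrors the default quoted above.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical holder object, used only for this illustration.
object NettyOOMFlagSketch {
  // Same name as the flag in the diff: true while Netty is considered OOMed.
  val isNettyOOMOnShuffle = new AtomicBoolean(false)

  // Assumed value: 200 MiB, matching the default mentioned in the comment.
  val maxReqSizeShuffleToMem: Long = 200L * 1024 * 1024

  // Resets the flag only when the measured free memory is larger than the
  // given lower bound. `freeNettyMemory` is a hypothetical supplier that
  // returns the current free Netty memory in bytes.
  def resetNettyOOMFlagIfPossible(
      freeMemoryLowerBound: Long,
      freeNettyMemory: () => Long): Unit = {
    if (isNettyOOMOnShuffle.get() && freeNettyMemory() > freeMemoryLowerBound) {
      isNettyOOMOnShuffle.compareAndSet(true, false)
    }
  }
}
```

With a lower bound of `maxReqSizeShuffleToMem`, the flag stays set while free memory is at or below 200 MiB and is cleared once it rises above that. With a lower bound of `0`, as in the `reqsInFlight == 0` branch of the diff, any positive amount of free memory clears it.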