Ngone51 commented on a change in pull request #32287:
URL: https://github.com/apache/spark/pull/32287#discussion_r629241186



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -1193,6 +1193,15 @@ package object config {
       .intConf
       .createWithDefault(3)
 
+  private[spark] val SHUFFLE_MAX_ATTEMPTS_ON_NETTY_OOM =
+    ConfigBuilder("spark.shuffle.nettyOOM.maxAttempts")
+      .doc("The max attempts of a shuffle block would retry on Netty OOM issue 
before throwing " +
+        "the shuffle fetch failure.")
+      .version("3.2.0")
+      .internal()
+      .intConf
+      .createWithDefault(10)

Review comment:
      Given the discussion there (https://github.com/apache/spark/pull/32287#discussion_r625287419), Netty OOM could be raised more frequently in certain cases, e.g.,
   
   > For case b), the OOM threshold might be 20 requests. In this case, there are still 80 deferred requests, which would hit the OOM soon, as you mentioned. That being said, I think the current fix would work around the issue in the end. Note that the application would fail before the fix.
   
   Thus, I'd like to give the block more retry chances in case we fall into a situation like case b). See the sketch below.
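   A minimal, self-contained sketch of the bounded-retry idea (the object and method names, the counter, and the hard-coded default are illustrative only; in the real code path the limit would be read from `SHUFFLE_MAX_ATTEMPTS_ON_NETTY_OOM`):
   
   ```scala
   import scala.collection.mutable
   
   object NettyOOMRetrySketch {
     // Would be read from spark.shuffle.nettyOOM.maxAttempts (default 10).
     val maxAttemptsOnNettyOOM: Int = 10
   
     // Per-block count of fetch attempts that failed due to Netty OOM.
     private val oomAttempts = mutable.HashMap.empty[String, Int].withDefaultValue(0)
   
     // Returns true if the block's fetch should be deferred and retried,
     // false if the caller should throw the shuffle fetch failure instead.
     def shouldRetryOnNettyOOM(blockId: String): Boolean = {
       oomAttempts(blockId) += 1
       oomAttempts(blockId) <= maxAttemptsOnNettyOOM
     }
   }
   ```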



