venkata91 commented on a change in pull request #30691:
URL: https://github.com/apache/spark/pull/30691#discussion_r626774584



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -2075,6 +2075,25 @@ package object config {
       .booleanConf
       .createWithDefault(false)
 
+  private[spark] val PUSH_BASED_SHUFFLE_MERGE_RESULTS_TIMEOUT =
+    ConfigBuilder("spark.shuffle.push.merge.results.timeout")
+      .doc("Specify the max amount of time DAGScheduler waits for the merge results from " +
+        "all remote shuffle services for a given shuffle. DAGScheduler will start to submit " +
+        "following stages if not all results are received within the timeout.")
+      .version("3.1.0")
+      .stringConf
+      .createWithDefault("10s")
+
+  private[spark] val PUSH_BASED_SHUFFLE_MERGE_FINALIZE_TIMEOUT =
+    ConfigBuilder("spark.shuffle.push.merge.finalize.timeout")
+      .doc("Specify the amount of time DAGScheduler waits after all mappers finish for " +
+        "a given shuffle map stage before it starts sending merge finalize requests to " +
+        "remote shuffle services. This allows the shuffle services some extra time to " +
+        "merge as many blocks as possible.")
+      .version("3.1.0")
+      .stringConf
+      .createWithDefault("10s")
+

Review comment:
       I think this is currently hard to tune, but once the changes for
`SPARK-33701` (adaptive merge finalization) are in, this should mostly be
taken care of.
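
Until then, a job that needs different values would presumably override the two timeouts by hand. A minimal sketch, assuming the keys keep the names shown in this diff; the values are purely illustrative, not recommendations:

```scala
import org.apache.spark.SparkConf

// Illustrative values only; the two keys are the ones introduced in this diff.
val conf = new SparkConf()
  // Give shuffle services more time to merge blocks after the last mapper finishes.
  .set("spark.shuffle.push.merge.finalize.timeout", "30s")
  // Bound how long DAGScheduler waits for merge results before submitting later stages.
  .set("spark.shuffle.push.merge.results.timeout", "5s")
```

Both entries are declared as `stringConf` in this revision, so the values are passed as time strings such as `"30s"`.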






