mridulm commented on code in PR #41076:
URL: https://github.com/apache/spark/pull/41076#discussion_r1357432292


##########
core/src/main/scala/org/apache/spark/internal/config/package.scala:
##########
@@ -2576,4 +2576,16 @@ package object config {
       .stringConf
       .toSequence
       .createWithDefault("org.apache.spark.sql.connect.client" :: Nil)
+
+  private[spark] val DECOMMISSION_MAX_RATIO =
+    ConfigBuilder("spark.decommission.maxRatio")
+      .doc("Max ratio of decommissioning executor of running executors. 
Decommission too many " +
+        s"executors at the same time with shuffle or rdd migration enabled " +
+        s"could hurt performance of shuffle fetch " +
+        s"as shuffle or rdd migration compete network and disk IO with shuffle 
fetch. " +
+        s"This only affects decommission triggered by driver.")
+      .version("3.5.0")
+      .doubleConf
+      .checkValue(ratio => ratio > 0 && ratio <= 1, "Decommission max ratio 
should > 0 and <= 1")
+      .createWithDefault(1)

Review Comment:
   Instead of a ratio, wondering if it makes more sense to have an absolute 
number instead.
   For example, if the number of active executors is small, a ratio will impact 
unnecessarily reduce the number of executors concurrently being decomm'ed.



##########
core/src/main/scala/org/apache/spark/internal/config/package.scala:
##########
@@ -2576,4 +2576,16 @@ package object config {
       .stringConf
       .toSequence
       .createWithDefault("org.apache.spark.sql.connect.client" :: Nil)
+
+  private[spark] val DECOMMISSION_MAX_RATIO =
+    ConfigBuilder("spark.decommission.maxRatio")
+      .doc("Max ratio of decommissioning executor of running executors. 
Decommission too many " +
+        s"executors at the same time with shuffle or rdd migration enabled " +
+        s"could hurt performance of shuffle fetch " +
+        s"as shuffle or rdd migration compete network and disk IO with shuffle 
fetch. " +
+        s"This only affects decommission triggered by driver.")
+      .version("3.5.0")

Review Comment:
   3.5 -> 4.0



##########
core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala:
##########
@@ -137,6 +138,17 @@ private[spark] class ExecutorMonitor(
     timedOutExecs.sortBy(_._1)
   }
 
+  /** Return executors can be removed constrained by [[decommissionMaxRatio]] 
*/
+  def executorsCanBeRemoved(): Seq[(String, Int)] = {
+    if (hasStorageDecommission()) {
+      val maxExecutorsCanBeRemoved = math.floor(executorCount * 
decommissionMaxRatio
+        - decommissioningCount).toInt

Review Comment:
   If we are going with a ratio, make this a ceil and not floor - to prevent 
`maxExecutorsCanBeRemoved` from being `0`



##########
core/src/main/scala/org/apache/spark/internal/config/package.scala:
##########
@@ -2576,4 +2576,16 @@ package object config {
       .stringConf
       .toSequence
       .createWithDefault("org.apache.spark.sql.connect.client" :: Nil)
+
+  private[spark] val DECOMMISSION_MAX_RATIO =
+    ConfigBuilder("spark.decommission.maxRatio")
+      .doc("Max ratio of decommissioning executor of running executors. 
Decommission too many " +
+        s"executors at the same time with shuffle or rdd migration enabled " +
+        s"could hurt performance of shuffle fetch " +
+        s"as shuffle or rdd migration compete network and disk IO with shuffle 
fetch. " +
+        s"This only affects decommission triggered by driver.")
+      .version("3.5.0")
+      .doubleConf
+      .checkValue(ratio => ratio > 0 && ratio <= 1, "Decommission max ratio 
should > 0 and <= 1")
+      .createWithDefault(1)

Review Comment:
   Instead of a ratio, wondering if it makes more sense to have an absolute 
number instead.
   For example, if the number of active executors is small, a ratio will impact 
unnecessarily reduce the number of executors concurrently being decomm'ed.
   
   +CC @dongjoon-hyun 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to