agrawaldevesh commented on a change in pull request #28818:
URL: https://github.com/apache/spark/pull/28818#discussion_r458381415
##########
File path: core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala
##########
@@ -81,6 +81,35 @@ private[spark] trait ExecutorAllocationClient {
countFailures: Boolean,
force: Boolean = false): Seq[String]
+  /**
+   * Request that the cluster manager decommission the specified executors.
+   * The default implementation delegates to kill; a scheduler must override
+   * this if it supports graceful decommissioning.
+   *
+   * @param executorIds identifiers of executors to decommission
+   * @param adjustTargetNumExecutors whether the target number of executors
+   *        will be adjusted down after these executors have been decommissioned.
+   * @return the ids of the executors acknowledged by the cluster manager
+   *         to be removed.
+   */
+  def decommissionExecutors(
+      executorIds: Seq[String],
+      adjustTargetNumExecutors: Boolean): Seq[String]
+
+
+  /**
+   * Request that the cluster manager decommission the specified executor.
+   * Delegates to decommissionExecutors with adjustTargetNumExecutors = true.
+   *
+   * @param executorId identifier of the executor to decommission
+   * @return whether the request is acknowledged by the cluster manager.
+   */
+  def decommissionExecutor(executorId: String): Boolean = {
+    val decommissionedExecutors = decommissionExecutors(Seq(executorId),
+      adjustTargetNumExecutors = true)
+    // Succeed only if the cluster manager acknowledged removal of this executor.
+    decommissionedExecutors.nonEmpty && decommissionedExecutors(0).equals(executorId)
+  }
Review comment:
Hmm, should we rename `decommissionExecutor` (singular) to
`decommissionAndKillExecutor` to better reflect its purpose? It would be too
easy to confuse it with `decommissionExecutors` (on line 95 of this file),
which allows the caller to not replenish the target number of executors.
Do you want to make the change to the callers of `decommissionExecutor` in
this PR and switch them to `decommissionExecutors(Seq(executorId), false)`
instead (see the sketch after this list)? The ones I am most concerned about are:
- the handling of the DecommissionExecutor message (both sync and async
variants) in CoarseGrainedSchedulerBackend
- StandaloneSchedulerBackend.executorDecommissioned
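
To make the proposed call shape concrete, here is a minimal, self-contained
sketch. The object, trait, and handler names are hypothetical, not the actual
scheduler backend code; the trait method mirrors the declaration added in this PR:

```scala
// Hedged sketch only: names here are illustrative, not real Spark code.
object DecommissionCallerSketch {
  // Mirrors the decommissionExecutors declaration added in this PR.
  trait AllocationClient {
    def decommissionExecutors(
        executorIds: Seq[String],
        adjustTargetNumExecutors: Boolean): Seq[String]
  }

  // Illustrative handler for a DecommissionExecutor-style message: the
  // replenishment decision is made explicitly at the call site rather
  // than being hardcoded inside decommissionExecutor.
  def handleDecommission(client: AllocationClient, executorId: String): Boolean = {
    // Per the javadoc above, false means the target number of executors
    // is not adjusted down after this executor is decommissioned.
    val acked = client.decommissionExecutors(
      Seq(executorId), adjustTargetNumExecutors = false)
    acked.contains(executorId)
  }
}
```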
In both of the above cases, I think we may not _always_ want replenishing. For
example, in the standalone case, when the Worker gets a SIGPWR, do we want to
replenish the executors on the remaining workers (i.e., oversubscribe the
remaining workers)? Similarly, when an executor gets a SIGPWR, do we want to
put that load on the remaining executors? I think the answer to both should be
no unless we are doing dynamic allocation.
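
If we wanted to encode that rule directly, one hedged sketch could look like
the following. The helper and object names are hypothetical;
`spark.dynamicAllocation.enabled` is the standard config key:

```scala
import org.apache.spark.SparkConf

object ReplenishPolicySketch {
  // Hypothetical helper: allow replenishment of decommissioned executors
  // only when dynamic allocation is enabled, per the reasoning above.
  def shouldReplenish(conf: SparkConf): Boolean =
    conf.getBoolean("spark.dynamicAllocation.enabled", defaultValue = false)
}
```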
Personally, I am fine with any choice of naming here as long as the semantics
are not silently changed under the covers, as is the case presently.