cloud-fan commented on a change in pull request #31348:
URL: https://github.com/apache/spark/pull/31348#discussion_r578388870



##########
File path: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
##########
@@ -750,6 +760,46 @@ private[deploy] class Worker(
     }
   }
 
+  /**
+   * Send `ExecutorStateChanged` to the current master. Unlike `sendToMaster`, we use `askSync`
+   * to send the message in order to ensure Master can receive the message.
+   */
+  private def syncExecutorStateWithMaster(newState: ExecutorStateChanged): Unit = {
+    master match {
+      case Some(masterRef) =>
+        val fullId = s"${newState.appId}/${newState.execId}"
+        try {
+          // SPARK-34245: We used async `send` to send the state previously. In that case, the
+          // finished executor can be leaked if Worker fails to send `ExecutorStateChanged`
+          // message to Master due to some unexpected errors, e.g., temporary network error.
+          // In the worst case, the application can get hang if the leaked executor is the only
+          // or last executor for the application. Therefore, we switch to `askSync` to ensure
+          // the state is handled by Master.
+          masterRef.askSync[Boolean](newState)
+          executorStateSyncFailureAttempts.remove(fullId)
+        } catch {
+          case t: Throwable =>
+            val failures = executorStateSyncFailureAttempts.getOrElseUpdate(fullId, 0) + 1

Review comment:
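It seems we only need `getOrElse(fullId, 0) + 1` here, since the map is updated later via `executorStateSyncFailureAttempts(fullId) = failures`. With `getOrElseUpdate`, a missing key is first written as `0` and then immediately overwritten, so the first write is redundant.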
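
A minimal sketch of the suggested form, assuming `executorStateSyncFailureAttempts` behaves like a `scala.collection.mutable.HashMap[String, Int]`. The object name `FailureCounterSketch`, the `attempts` map, and the `recordFailure` helper are hypothetical and exist only for illustration; they are not part of the PR.

```scala
import scala.collection.mutable

object FailureCounterSketch {
  // Hypothetical stand-in for the Worker's `executorStateSyncFailureAttempts`
  // (assumed here to be a mutable map from "appId/execId" to a failure count).
  val attempts: mutable.HashMap[String, Int] = mutable.HashMap.empty[String, Int]

  // Read with a default and write exactly once. `getOrElse` does not touch
  // the map, so the only write is the explicit update on the next line.
  def recordFailure(fullId: String): Int = {
    val failures = attempts.getOrElse(fullId, 0) + 1
    attempts(fullId) = failures
    failures
  }
}
```

The resulting count is the same as with `getOrElseUpdate(fullId, 0) + 1`; the difference is only that the intermediate `0` entry is never stored.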






