Ngone51 commented on a change in pull request #28258:
URL: https://github.com/apache/spark/pull/28258#discussion_r422744560



##########
File path: core/src/main/scala/org/apache/spark/deploy/Client.scala
##########
@@ -176,6 +190,25 @@ private class ClientEndpoint(
       } else if (!Utils.responseFromBackup(message)) {
         System.exit(-1)
       }
+
+    case DriverStatusResponse(found, state, _, _, _) =>
+      if (found) {
+        state.get match {
+          case DriverState.FINISHED | DriverState.FAILED |
+               DriverState.ERROR | DriverState.KILLED =>
+            logInfo(s"State of $submittedDriverID is ${state.get}, " +

Review comment:
       nit: s"State of **driver** $submittedDriverID ..."

##########
File path: core/src/main/scala/org/apache/spark/deploy/Client.scala
##########
@@ -176,6 +190,25 @@ private class ClientEndpoint(
       } else if (!Utils.responseFromBackup(message)) {
         System.exit(-1)
       }
+
+    case DriverStatusResponse(found, state, _, _, _) =>
+      if (found) {
+        state.get match {
+          case DriverState.FINISHED | DriverState.FAILED |
+               DriverState.ERROR | DriverState.KILLED =>
+            logInfo(s"State of $submittedDriverID is ${state.get}, " +
+              s"exiting spark-submit JVM.")
+            System.exit(0)
+          case _ =>
+            Thread.sleep(REPORT_DRIVER_STATUS_INTERVAL)
+            logInfo(s"State of $submittedDriverID is ${state.get}, " +

Review comment:
       nit: s"State of driver $submittedDriverID ..."

##########
File path: core/src/main/scala/org/apache/spark/deploy/Client.scala
##########
@@ -176,6 +190,25 @@ private class ClientEndpoint(
       } else if (!Utils.responseFromBackup(message)) {
         System.exit(-1)
       }
+
+    case DriverStatusResponse(found, state, _, _, _) =>
+      if (found) {
+        state.get match {
+          case DriverState.FINISHED | DriverState.FAILED |
+               DriverState.ERROR | DriverState.KILLED =>
+            logInfo(s"State of $submittedDriverID is ${state.get}, " +
+              s"exiting spark-submit JVM.")
+            System.exit(0)
+          case _ =>
+            Thread.sleep(REPORT_DRIVER_STATUS_INTERVAL)

Review comment:
       `Thread.sleep` could still have the same issue; imagine the network drop happening during the sleep. We should move the periodic sending logic out of `receive`. We could mimic `CoarseGrainedSchedulerBackend` to do the same work here:
   
   https://github.com/apache/spark/blob/9faad07ce706890008a8a3ce675fa95b0bdf7c14/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L137-L139

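The pattern referenced above can be sketched as follows: schedule the status poll on a dedicated executor thread instead of sleeping inside `receive`, so a dropped response cannot stall the polling loop. This is an illustrative sketch, not the PR's code; `pollDriverStatus` is a hypothetical stand-in for sending `RequestDriverStatus` to the master, and the interval constant mirrors `REPORT_DRIVER_STATUS_INTERVAL`.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}

// Sketch of the CoarseGrainedSchedulerBackend-style pattern: run the periodic
// work on a single-threaded scheduled executor rather than blocking receive.
object DriverStatusPoller {
  // Stand-in for REPORT_DRIVER_STATUS_INTERVAL (assumed 1s here).
  private val reportIntervalMs = 1000L

  private val pollThread: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor()

  /** Start polling; `pollDriverStatus` would send RequestDriverStatus. */
  def start(pollDriverStatus: () => Unit): Unit = {
    pollThread.scheduleAtFixedRate(
      new Runnable {
        override def run(): Unit = pollDriverStatus()
      },
      0L, reportIntervalMs, TimeUnit.MILLISECONDS)
  }

  /** Stop polling, e.g. once a terminal DriverState is observed. */
  def stop(): Unit = pollThread.shutdownNow()
}
```

With this shape, `receive` only handles `DriverStatusResponse` and decides whether to call `stop()` and exit; the timing lives entirely outside the RPC handler.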
##########
File path: core/src/main/scala/org/apache/spark/deploy/Client.scala
##########
@@ -176,6 +190,25 @@ private class ClientEndpoint(
       } else if (!Utils.responseFromBackup(message)) {
         System.exit(-1)
       }
+
+    case DriverStatusResponse(found, state, _, _, _) =>
+      if (found) {
+        state.get match {
+          case DriverState.FINISHED | DriverState.FAILED |
+               DriverState.ERROR | DriverState.KILLED =>
+            logInfo(s"State of $submittedDriverID is ${state.get}, " +
+              s"exiting spark-submit JVM.")
+            System.exit(0)
+          case _ =>
+            Thread.sleep(REPORT_DRIVER_STATUS_INTERVAL)
+            logInfo(s"State of $submittedDriverID is ${state.get}, " +

Review comment:
       Since status polling will happen every second, I'm afraid the logs can get too verbose. We could log only after a constant number of polls, e.g. every 60 times.
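The throttling suggested above could look roughly like the following. This is a sketch under the reviewer's assumption of logging every 60th poll; the class and constant names are illustrative, not from the PR.

```scala
// Sketch: emit a log line only once every `logEveryN` polls, so per-second
// status polling does not flood the client log.
class StatusLogThrottle(logEveryN: Int = 60) {
  private var pollCount = 0L

  /** Returns true when this poll iteration should emit a log line. */
  def shouldLog(): Boolean = {
    val emit = pollCount % logEveryN == 0
    pollCount += 1
    emit
  }
}
```

The non-terminal branch of the `match` would then guard its `logInfo` call with `if (throttle.shouldLog())`, logging immediately on the first poll and then once per minute at a one-second interval.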

##########
File path: docs/spark-standalone.md
##########
@@ -374,6 +374,25 @@ To run an interactive Spark shell against the cluster, run the following command
 
 You can also pass an option `--total-executor-cores <numCores>` to control the number of cores that spark-shell uses on the cluster.
 
+# Client Properties
+
+Spark applications support the following configuration properties specific to standalone mode:
+
+<table class="table">
+  <tr><th style="width:21%">Property Name</th><th>Default Value</th><th>Meaning</th><th>Since Version</th></tr>
+  <tr>
+  <td><code>spark.standalone.submit.waitAppCompletion</code></td>
+  <td><code>false</code></td>
+  <td>
+  In standalone cluster mode, controls whether the client waits to exit until the application completes.
+  If set to <code>true</code>, the client process will stay alive polling the application's status.

Review comment:
       nit: `application's` or `driver`?

##########
File path: core/src/main/scala/org/apache/spark/deploy/Client.scala
##########
@@ -61,6 +61,10 @@ private class ClientEndpoint(
 
    private val lostMasters = new HashSet[RpcAddress]
    private var activeMasterEndpoint: RpcEndpointRef = null
+   private val waitAppCompletion = conf.getBoolean("spark.standalone.submit.waitAppCompletion",

Review comment:
       We usually use `ConfigEntry` for a new conf. Could you add it too? 
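A `ConfigEntry` for this conf could be sketched as below, assuming it is declared alongside the other entries in `org.apache.spark.internal.config`; the `version` string and doc text are assumptions, not taken from the PR.

```scala
// Sketch of a ConfigEntry definition (a config fragment, depends on Spark's
// internal ConfigBuilder; placement and version are assumptions):
private[spark] val STANDALONE_SUBMIT_WAIT_APP_COMPLETION =
  ConfigBuilder("spark.standalone.submit.waitAppCompletion")
    .doc("In standalone cluster mode, whether the client waits to exit " +
      "until the application completes.")
    .version("3.1.0")
    .booleanConf
    .createWithDefault(false)
```

`ClientEndpoint` would then read it with `conf.get(STANDALONE_SUBMIT_WAIT_APP_COMPLETION)` instead of the raw `conf.getBoolean(...)` string lookup.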




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


