akshatb1 commented on a change in pull request #28258:
URL: https://github.com/apache/spark/pull/28258#discussion_r422640716
##########
File path: core/src/main/scala/org/apache/spark/deploy/Client.scala
##########
@@ -124,38 +127,57 @@ private class ClientEndpoint(
}
}
- /* Find out driver status then exit the JVM */
+ /**
+ * Find out driver status then exit the JVM. If the waitAppCompletion is set to true, monitors
+ * the application until it finishes, fails or is killed.
+ */
def pollAndReportStatus(driverId: String): Unit = {
// Since ClientEndpoint is the only RpcEndpoint in the process, blocking the event loop thread
// is fine.
logInfo("... waiting before polling master for driver state")
Thread.sleep(5000)
logInfo("... polling master for driver state")
- val statusResponse =
- activeMasterEndpoint.askSync[DriverStatusResponse](RequestDriverStatus(driverId))
- if (statusResponse.found) {
- logInfo(s"State of $driverId is ${statusResponse.state.get}")
- // Worker node, if present
- (statusResponse.workerId, statusResponse.workerHostPort, statusResponse.state) match {
- case (Some(id), Some(hostPort), Some(DriverState.RUNNING)) =>
- logInfo(s"Driver running on $hostPort ($id)")
- case _ =>
- }
- // Exception, if present
- statusResponse.exception match {
- case Some(e) =>
- logError(s"Exception from cluster was: $e")
- e.printStackTrace()
- System.exit(-1)
- case _ =>
- System.exit(0)
+ while (true) {
Review comment:
Hi @Ngone51, I tried sending periodic messages from the loop in
`pollAndReportStatus`, but the endpoint does not seem to receive any message
until the sending loop has finished. (I checked this with a `for` loop; with
the current `while (true)` loop it would be stuck in an infinite loop.) Hence,
I have implemented it by sending an async message from `pollAndReportStatus`
and, if need be, sending the message again while handling the received
message. Please let me know what you think of this approach. I have tested the
common scenarios, and I could see the `onNetworkError` method being called
when the Spark master is shut down while an application is running.
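
To illustrate the idea, here is a minimal, Spark-free sketch of the
"send a message, then re-send from the receive handler" pattern. It is not the
code in this PR: `ToyEndpoint`, `PollStatus`, `StatusReply`, and `Demo` are
hypothetical names, the single-threaded executor only stands in for an
endpoint's event loop, and the state strings are placeholders.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.collection.mutable.ArrayBuffer

// Hypothetical message types for this sketch (not Spark's DeployMessages).
sealed trait Msg
case object PollStatus extends Msg
final case class StatusReply(state: String) extends Msg

// A toy endpoint: every message is handled on one thread, so a handler that
// loops forever would prevent any later message from being delivered.
final class ToyEndpoint(states: Iterator[String]) {
  private val loop = Executors.newSingleThreadExecutor()
  val seen = ArrayBuffer.empty[String] // only mutated on the loop thread

  // Enqueue a message; it is handled after the current handler returns.
  def send(msg: Msg): Unit =
    loop.execute(new Runnable { def run(): Unit = receive(msg) })

  private def receive(msg: Msg): Unit = msg match {
    case PollStatus =>
      // Stand-in for an async ask: the reply arrives as a separate message.
      send(StatusReply(states.next()))
    case StatusReply(state) =>
      seen += state
      if (state == "RUNNING") send(PollStatus) // not finished: poll again
      else loop.shutdown()                     // terminal state: stop the loop
  }

  def awaitDone(): Boolean = loop.awaitTermination(5, TimeUnit.SECONDS)
}

object Demo {
  def run(): List[String] = {
    val ep = new ToyEndpoint(Iterator("RUNNING", "RUNNING", "FINISHED"))
    ep.send(PollStatus)
    ep.awaitDone()
    ep.seen.toList // List(RUNNING, RUNNING, FINISHED)
  }
}
```

Because each handler returns right away and the next poll is just another
queued message, other messages (for example a network-error notification) can
be interleaved between polls, which a blocking `while (true)` inside a single
handler would not allow.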