Ngone51 commented on a change in pull request #28746:
URL: https://github.com/apache/spark/pull/28746#discussion_r438151794
##########
File path: core/src/main/scala/org/apache/spark/deploy/LocalSparkCluster.scala
##########
@@ -74,6 +74,10 @@ class LocalSparkCluster(
def stop(): Unit = {
logInfo("Shutting down local Spark cluster.")
+ // SPARK-31922: wait one more second before shutting down rpcEnvs of master and worker,
+ // in order to let the cluster have time to handle the `UnregisterApplication` message.
+ // Otherwise, we could hit "RpcEnv already stopped" error.
+ Thread.sleep(1000)
Review comment:
> on my box the issue has not shown up after the PR making sure the
> client is stopped only after finishApplication and reworking the shutdown logic

I guess the issue doesn't show up because `askSync` to the Master gives the
Worker a little more time to handle the messages than `send` did. But the fact
that the Master sends messages to the Worker asynchronously hasn't changed, so
you still cannot be sure that the Worker has handled those messages by the time
`stop` is called.
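
To illustrate the point, here is a minimal sketch that uses plain JDK executors rather than Spark's RPC layer. Everything in it (`AskSyncSketch`, `masterLoop`, `workerLoop`, the latches) is a hypothetical stand-in, not code from `LocalSparkCluster` or `RpcEnv`. The blocking `submit(...).get()` plays the role of `askSync` to the Master, and `execute(...)` plays the role of the Master's asynchronous send to the Worker. A latch keeps the Worker "busy" so the outcome does not depend on timing.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical model: one single-threaded "message loop" per endpoint.
object AskSyncSketch {
  // Returns (worker handled the message when the ask returned,
  //          worker handled the message after we explicitly waited).
  def demo(): (Boolean, Boolean) = {
    val masterLoop = Executors.newSingleThreadExecutor()
    val workerLoop = Executors.newSingleThreadExecutor()
    val workerMayProceed = new CountDownLatch(1) // models a Worker that is still busy
    val workerDone = new CountDownLatch(1)
    val workerHandled = new AtomicBoolean(false)

    val masterHandler = new Runnable {
      def run(): Unit = {
        // Master -> Worker: fire-and-forget, the Master does not wait for it.
        workerLoop.execute(new Runnable {
          def run(): Unit = {
            workerMayProceed.await()
            workerHandled.set(true)
            workerDone.countDown()
          }
        })
      }
    }

    // Client -> Master: blocks until the Master's handler has returned.
    masterLoop.submit(masterHandler).get()
    // A stop() issued at this point would race with the Worker.
    val handledWhenAskReturned = workerHandled.get()

    // Only an explicit wait on the Worker gives a real guarantee.
    workerMayProceed.countDown()
    workerDone.await(5, TimeUnit.SECONDS)
    val handledAfterWaiting = workerHandled.get()

    masterLoop.shutdown()
    workerLoop.shutdown()
    (handledWhenAskReturned, handledAfterWaiting)
  }
}
```

In this model the synchronous call only proves that the Master handled its message. Whether the Worker has handled the forwarded one is unknown until something explicitly waits on the Worker.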