Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/10319#discussion_r51471308
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala
---
@@ -364,7 +379,27 @@ private[spark] class CoarseMesosSchedulerBackend(
}
override def stop() {
- super.stop()
+ // Make sure we're not launching tasks during shutdown
+ stateLock.synchronized {
+ if (stopCalled) {
+ logWarning("Stop called multiple times, ignoring")
+ return
+ }
+ stopCalled = true
+ super.stop()
+ }
+ // Wait for finish
+ val stopwatch = new Stopwatch()
+ stopwatch.start()
+ // slaveIdsWithExecutors has no memory barrier, so this is eventually consistent
+ while (slaveIdsWithExecutors.nonEmpty &&
+ stopwatch.elapsed(TimeUnit.MILLISECONDS) < shutdownTimeoutMS) {
+ Thread.sleep(100)
+ }
+ if(slaveIdsWithExecutors.nonEmpty) {
+ logWarning(s"${slaveIdsWithExecutors.size} executors still running. "
+ + "Proceeding with mesos driver stop.")
--- End diff --
I don't understand this warning message. I think you mean something more
like:
```
Timed out on waiting for executors to terminate ($X still running) after $timeout ms.
Proceeding to stop Mesos driver, which may lead to leftover temporary files on the slaves.
```
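For illustration only, here is a minimal sketch of how the wait-and-warn step could build a message along those lines. The object `ShutdownWaitSketch`, the helper `awaitExecutors`, and its parameters are hypothetical names that are not part of the pull request; the sketch returns the warning text instead of calling `logWarning` so that it stays self-contained.

```scala
import java.util.concurrent.TimeUnit

// Hypothetical helper, not code from the pull request.
object ShutdownWaitSketch {
  /**
   * Polls `remaining` until it reports zero or `timeoutMs` has elapsed.
   * Returns Some(warning text) if executors are still running after the
   * timeout, or None if they all terminated in time.
   */
  def awaitExecutors(
      remaining: () => Int,
      timeoutMs: Long,
      pollMs: Long = 100L): Option[String] = {
    val deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs)
    // Compare via subtraction so the check stays correct if nanoTime wraps.
    while (remaining() > 0 && System.nanoTime() - deadline < 0) {
      Thread.sleep(pollMs)
    }
    val stillRunning = remaining()
    if (stillRunning > 0) {
      Some(
        s"Timed out on waiting for executors to terminate ($stillRunning still running) " +
          s"after $timeoutMs ms. Proceeding to stop Mesos driver, which may lead to " +
          "leftover temporary files on the slaves.")
    } else {
      None
    }
  }
}
```

In the backend, the returned text would presumably be passed to `logWarning`, with `remaining` reading `slaveIdsWithExecutors.size` and `timeoutMs` set to `shutdownTimeoutMS`.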