Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/7846#discussion_r36218909
--- Diff:
yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
---
@@ -131,24 +131,35 @@ private[spark] class YarnClientSchedulerBackend(
}
}
+ private class MonitorThread extends Thread {
+ private var doInterrupt = true
+
+ override def run() {
+ try {
+ val (state, _) = client.monitorApplication(appId, logApplicationReport = false)
+ logError(s"Yarn application has already exited with state $state!")
+ doInterrupt = false
--- End diff --
I'd call this `allowInterrupt`.
So it took me a bit to understand why this code is like this. Basically when
you interrupt it's because the SparkContext is being shut down (`sc.stop()`
called by user code), and you do not want `sc.stop()` to be called again here.
Now if `monitorApplication()` returns, it means the YARN app finished before
`sc.stop()` was called, which means this code should call `sc.stop()`. Could
you write a small comment explaining that so that in the future people know
what's going on here?
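
For illustration, here is a rough, self-contained sketch of how the renamed flag and the requested comment might look. It is not the code from the pull request: the diff above is cut off right after the flag assignment, so everything past that point is an assumption. `monitorApplication` and `stopContext` are hypothetical stand-ins for `client.monitorApplication(...)` and `sc.stop()`, and `stopMonitor()` is a hypothetical name for whatever method the backend calls during shutdown.

```scala
// Hypothetical stand-ins (not Spark APIs):
//   monitorApplication: blocks until the YARN app finishes, returns its final state
//   stopContext:        plays the role of sc.stop()
class MonitorThread(monitorApplication: () => String, stopContext: () => Unit)
    extends Thread {

  // True while the shutdown path is allowed to interrupt this thread.
  @volatile private var allowInterrupt = true

  override def run(): Unit = {
    try {
      val state = monitorApplication()
      // Getting here means the YARN application finished before sc.stop()
      // was called, so this thread has to stop the context itself. Stopping
      // the context ends up calling stopMonitor(), so disallow the interrupt
      // first; otherwise this thread would interrupt itself mid-shutdown.
      System.err.println(s"Yarn application has already exited with state $state!")
      allowInterrupt = false
      stopContext()
    } catch {
      // Interrupted because sc.stop() was called by user code and the context
      // is already shutting down; do not call stopContext() again.
      case _: InterruptedException => ()
    }
  }

  // Assumed to be called from the shutdown path (hypothetical name).
  def stopMonitor(): Unit = {
    if (allowInterrupt) {
      this.interrupt()
    }
  }
}
```

With this shape, an interrupt only ever comes from a user-initiated shutdown, and a normal return from `monitorApplication` is the only path that triggers the stop from inside the thread.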