Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/16189#discussion_r92005198
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -161,12 +163,7 @@ private[spark] class Executor(
* @param interruptThread whether to interrupt the task thread
*/
def killAllTasks(interruptThread: Boolean) : Unit = {
- // kill all the running tasks
- for (taskRunner <- runningTasks.values().asScala) {
- if (taskRunner != null) {
- taskRunner.kill(interruptThread)
- }
- }
+ runningTasks.keys().asScala.foreach(t => killTask(t, interruptThread =
interruptThread))
--- End diff --
Yes, it is not at all a likely case. What is the most likely case does
concern me, though.
The default value for `spark.job.interruptOnCancel` is `false`; and from
the little bit of discussion on SPARK-17064, it doesn't seem that it will be
very easy to change that with confidence. That means that `interruptThread`
will also be `false` except for the hard-coded `true` in `handleSuccessfulTask`
(which is itself suspect). The way that I am understanding the default
behavior of this PR, then, is that if the reaper functionality is enabled, any
kill request of a task (other than from within `handleSuccessfulTask`) that
doesn't manage to complete within `killTimeoutMs` of that kill request will
result in the Executor's JVM being killed -- i.e. the most likely default
behavior looks to boil down to `if(killTask) killJvmAfterKillTimeoutMs`. That
is potentially expensive in terms of lost state from that Executor. If the
external shuffle service is being used, then we shouldn't need to lose the
state of the shuffle files; and if one of the external block storage options is
being used, then we shouldn't need to lose state on cached RDDs/tables,
broadcast variables and/or accumulators. But none of that externalization and
persistence of state across Executor restarts is currently the default -- and
that concerns me some.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]