Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/10045#discussion_r46319598
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -58,14 +58,21 @@ private[spark] class TaskSetManager(
val conf = sched.sc.conf
- /*
- * Sometimes if an executor is dead or in an otherwise invalid state, the driver
- * does not realize right away leading to repeated task failures. If enabled,
- * this temporarily prevents a task from re-launching on an executor where
- * it just failed.
+ /**
+ * This timeout (specified in milliseconds) is used to prevent tasks from being immediatley
+ * re-launched on an executor where the task has already failed. A task will not be re-launched
+ * on an executor where it has already failed until this amount of time has elapsed since the
+ * failure. One example of when this is useful is if an executor is in dead and the driver
--- End diff --
Extra "in" (or maybe it was supposed to be "in fact").
I might change this to talk about misconfigured executors that appear to be
alive but can't run tasks. At least that's the use case I've seen for this.
But no big deal really.
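
For reference, the behaviour the new doc comment describes can be sketched roughly as below. This is only an illustration under stated assumptions: the class FailureBlacklist and its methods recordFailure and isBlacklisted are hypothetical names, not taken from TaskSetManager, and the caller passes the current time in explicitly so the example stays self-contained.

```scala
import scala.collection.mutable

// Hypothetical sketch; names are illustrative and not taken from TaskSetManager.
// timeoutMs: how long (in milliseconds) a task stays barred from an executor
// on which it has failed.
class FailureBlacklist(timeoutMs: Long) {
  // task index -> (executor id -> time of the most recent failure, in ms)
  private val failures =
    mutable.HashMap.empty[Int, mutable.HashMap[String, Long]]

  // Remember that the given task failed on the given executor at nowMs.
  def recordFailure(taskIndex: Int, execId: String, nowMs: Long): Unit = {
    val perExecutor =
      failures.getOrElseUpdate(taskIndex, mutable.HashMap.empty[String, Long])
    perExecutor(execId) = nowMs
  }

  // True while less than timeoutMs has elapsed since the task last failed
  // on this executor; false if it never failed there.
  def isBlacklisted(taskIndex: Int, execId: String, nowMs: Long): Boolean =
    failures.get(taskIndex)
      .flatMap(_.get(execId))
      .exists(failedAt => nowMs - failedAt < timeoutMs)
}
```

The check is per (task, executor) pair, so an executor that is alive but misconfigured for one task can still be offered other tasks, and a timeout of 0 disables the check entirely.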