Github user davies commented on the pull request:
https://github.com/apache/spark/pull/10045#issuecomment-160763214
@kayousterhout The number of available executors can change all the time.
I'm thinking of a smaller change that permanently blacklists an executor after a
second failure: we never schedule a task on an executor if the task has already
failed on that executor twice.
The patch could be small:
```
- failedExecutors.getOrElseUpdate(index, new HashMap[String, Long]()).
-   put(info.executorId, clock.getTimeMillis())
+ val failed = failedExecutors.getOrElseUpdate(index, new HashMap[String, Long]())
+ if (failed.contains(info.executorId)) {
+   // never try the executor again if the task failed on it twice
+   failed.put(info.executorId, Long.MaxValue)
+ } else {
+   failed.put(info.executorId, clock.getTimeMillis())
+ }
```
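For what it's worth, here is a minimal standalone sketch of why storing Long.MaxValue
as the failure time makes the blacklist effectively permanent. It assumes the check
compares clock.getTimeMillis() minus the stored time against
EXECUTOR_TASK_BLACKLIST_TIMEOUT (roughly what TaskSetManager does); the object and
helper names below are made up for illustration, not taken from the patch:
```scala
import scala.collection.mutable.HashMap

// Illustrative only: names mirror TaskSetManager but are assumptions, not the real code.
object BlacklistSketch {
  val EXECUTOR_TASK_BLACKLIST_TIMEOUT = 1000L // ms, assumed config value

  // task index -> (executor id -> time of last failure in ms)
  val failedExecutors = new HashMap[Int, HashMap[String, Long]]()

  def executorIsBlacklisted(execId: String, index: Int, nowMs: Long): Boolean = {
    failedExecutors.get(index).exists { failed =>
      failed.get(execId).exists(failTime => nowMs - failTime < EXECUTOR_TASK_BLACKLIST_TIMEOUT)
    }
  }

  def main(args: Array[String]): Unit = {
    // Second failure of task 0 on exec-1: record Long.MaxValue instead of the clock time.
    failedExecutors.getOrElseUpdate(0, new HashMap[String, Long]())
      .put("exec-1", Long.MaxValue)
    // nowMs - Long.MaxValue is a huge negative number, always below the timeout,
    // so the executor stays blacklisted for this task no matter how much time passes.
    println(executorIsBlacklisted("exec-1", 0, System.currentTimeMillis())) // true
  }
}
```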