Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/159#issuecomment-37775065
For FetchFailed, the entire task set gets failed anyway and no new tasks
for that task set will be launched (because the previous stage needs to be
re-run), so blacklisting the executor won't have any effect.
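
To make that reasoning concrete, here is a toy Python sketch (hypothetical names throughout — this is not Spark's actual TaskSetManager code) modeling why a blacklist entry added on FetchFailed would be a no-op: once the whole task set is marked failed, no further tasks are ever offered from it, so the per-executor blacklist is never consulted.

```python
# Toy model of the argument above (hypothetical names, not Spark's real API).

class ToyTaskSet:
    def __init__(self, tasks):
        self.pending = list(tasks)
        self.failed = False      # set when the entire task set is failed
        self.blacklist = set()   # executors this task set refuses to use

    def handle_fetch_failed(self, executor_id, blacklist_executor=False):
        # A FetchFailed means the previous stage's map output is gone,
        # so the whole task set is failed and the earlier stage re-runs.
        if blacklist_executor:
            self.blacklist.add(executor_id)  # recorded, but moot (see below)
        self.failed = True

    def next_task(self, executor_id):
        # No tasks are offered from a failed task set, so after a
        # FetchFailed the blacklist check here is unreachable.
        if self.failed or executor_id in self.blacklist:
            return None
        return self.pending.pop() if self.pending else None

ts = ToyTaskSet(["t1", "t2"])
ts.handle_fetch_failed("exec-1", blacklist_executor=True)
print(ts.next_task("exec-2"))  # None: the set is failed, blacklist irrelevant
```

The same model also shows the downside Mridul raises below for other failure kinds: if the set is *not* failed but executors are aggressively blacklisted, a small cluster can run out of eligible executors for the remaining tasks.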
On Sun, Mar 16, 2014 at 4:18 PM, Mridul Muralidharan <
[email protected]> wrote:
> In case of FetchFailed, it does not help to blacklist the executor (since
> the failure was not local to that executor).
> TaskKilled is something I am unsure of - either is fine, I guess. I am not
> sure in what circumstances it will get fired, and whether any of them can
> be due to node issues while running the task. Any thoughts?
>
> Note that when the number of executors is reasonably high, blacklisting
> some for a task would be OK - but when the number of executors is low and
> the number of tasks per stage is not too high (or towards the end of a
> stage), aggressively marking executors as failed can slow the stage down
> a lot, from our observation.
>
> --
> Reply to this email directly or view it on
> GitHub <https://github.com/apache/spark/pull/159#issuecomment-37774958>
> .
>