GitHub user tgravescs commented on the issue:
https://github.com/apache/spark/pull/17113
> Another thing I thought about as I was reviewing this -- spark currently
assumes that a fetchfailure is always the fault of the source, never the
destination. I almost wonder if we should count it against both, with some
sensible heuristic for looking at a collection of failures and deciding who is
at fault.
I think it makes sense to try to tell the difference and improve our
logic there.
> Perhaps we should think of a better way to choose the right behavior.
Does this mean you don't want to go with this approach? I'm actually not
sure this is a huge change. It's a decent change in behavior, but for the cases
where nodes really do go down this could help a lot. I think Spark definitely
doesn't handle this case well now.
Sorry again that I haven't done a full review; I've been trying to think all
the fetch failure scenarios through and have just been busy with other things.
One downside to adding the config BLACKLIST_FETCH_FAILURE_ENABLED is that it
limits us if we later want to change this functionality to, say, only blacklist
on multiple fetch failures. We could track fetch failures across stage attempts
and blacklist a host only after X tasks have failed, possibly across stage
attempts. I guess it's marked as experimental, so it's a bit more OK for us to
change it. Perhaps X isn't just failed tasks but failed tasks from a % of
different hosts. You could potentially also use that to determine whether the
source or the destination is bad. If many tasks across hosts have failed, then
you would expect the source is bad; if it's just one destination, then perhaps
that is bad. Still thinking this through.
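To make the idea concrete, here is a rough sketch of that heuristic. It is
plain Python rather than Spark code, and everything in it is hypothetical:
the FetchFailureTracker class, its method names, and the two thresholds
are illustration only, not existing Spark APIs or configs. It also
simplifies "a % of different hosts" to a minimum count of distinct peer hosts.

```python
from collections import defaultdict


class FetchFailureTracker:
    """Hypothetical tracker of fetch failures across stage attempts.

    Not part of Spark; a sketch of the "blame source or destination" idea.
    """

    def __init__(self, min_failures=4, min_distinct_peers=2):
        # Assumed thresholds: "X failed tasks" and "from several different hosts".
        self.min_failures = min_failures
        self.min_distinct_peers = min_distinct_peers
        # source host -> destination hosts that failed to fetch from it
        self._by_source = defaultdict(list)
        # destination host -> source hosts it failed to fetch from
        self._by_destination = defaultdict(list)

    def record(self, source, destination):
        """Record one fetch failure; call this for every stage attempt."""
        self._by_source[source].append(destination)
        self._by_destination[destination].append(source)

    def suspects(self):
        """Return hosts with enough failures spread over enough distinct peers."""
        result = set()
        for table in (self._by_source, self._by_destination):
            for host, peers in table.items():
                enough_failures = len(peers) >= self.min_failures
                enough_peers = len(set(peers)) >= self.min_distinct_peers
                if enough_failures and enough_peers:
                    result.add(host)
        return result


if __name__ == "__main__":
    tracker = FetchFailureTracker(min_failures=3, min_distinct_peers=3)
    # Three different destinations fail to fetch from the same source.
    for dest in ("hostB", "hostC", "hostD"):
        tracker.record(source="hostA", destination=dest)
    print(tracker.suspects())
```

With these thresholds, a source that many different hosts fail to fetch from
is flagged, a destination that fails against many different sources is
flagged, and repeated failures between a single pair of hosts flag neither,
since that case does not say which side is at fault.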