Github user squito commented on the issue:
https://github.com/apache/spark/pull/20987
    > What I'm concerned about is whether there is a situation like "a task
    > gets killed after it gets a FetchFailure, but is re-run later and gets a
    > FetchFailure again without a TaskKilledException" (or whether this fix
    > only covers speculative tasks).
    I don't think it's worth trying to be fancy here -- in almost all
    situations, we don't care about the fetch failure handling when the task is
    killed. Even if this task is not speculative, it could be that it's killed
    because *another* speculative task finished, and so this one gets aborted.
    
    Suppose there is a real fetch failure, and you just happen to kill the task
    *just* after that. Since the task was killed, you don't really care about
    its input shuffle data at the moment in any case. You *might* run another
    job later on which tries to read the same shuffle data, and then it'll have
    to rediscover the missing shuffle data. But that's about it. Oh well.
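    To make the precedence I'm arguing for concrete, here's a minimal sketch (not the actual Executor/TaskRunner code; the names `TaskOutcome`, `resolveOutcome`, etc. are made up for illustration): if the task was killed, report it as killed even if it also hit a FetchFailure, because nothing is waiting on that task's result and the missing shuffle data can be rediscovered later if another job needs it.
    
    ```scala
    object TaskOutcomeSketch {
      sealed trait TaskOutcome
      case class Killed(reason: String) extends TaskOutcome
      case class FetchFailed(mapId: Long, reduceId: Int) extends TaskOutcome
      case object Succeeded extends TaskOutcome
    
      // The kill takes precedence over a fetch failure: nobody cares about this
      // task's shuffle input anymore, so don't trigger stage-retry bookkeeping.
      def resolveOutcome(killedReason: Option[String],
                         fetchFailure: Option[FetchFailed]): TaskOutcome = {
        killedReason match {
          case Some(reason) => Killed(reason)
          // Only surface the fetch failure if the task was *not* killed.
          case None         => fetchFailure.getOrElse(Succeeded)
        }
      }
    
      def main(args: Array[String]): Unit = {
        val ff = FetchFailed(mapId = 7L, reduceId = 0)
        // Killed just after a real fetch failure: report Killed, drop the FetchFailed.
        println(resolveOutcome(Some("another attempt succeeded"), Some(ff)))
        // Not killed: the fetch failure is reported so the stage can be retried.
        println(resolveOutcome(None, Some(ff)))
      }
    }
    ```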