Github user squito commented on the issue:
https://github.com/apache/spark/pull/21558
IMO your change is the right fix, not just a workaround. I don't think it's
a scheduler bug (though it's definitely unclear). I'll move that discussion to
the JIRA.
An alternative would be using `context.stageAttemptNumber()` and
`context.attemptNumber()` together.
> the stage ID can change for a different execution of the same stage,
IIRC, and that would reset the attempt id.
Hmm, the only place I could imagine that happening is with a shared shuffle
dependency between jobs, which gets renumbered and then skipped, but then
perhaps re-executed on a fetch failure. That isn't relevant here, though,
since it would only affect shuffle map stages, not result stages.
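The alternative mentioned above could be sketched as follows. This is a minimal, hypothetical illustration (not code from the PR): it assumes the values returned by Spark's `TaskContext.stageAttemptNumber()` and `TaskContext.attemptNumber()` and packs them into a single `long`, so the combined ID stays unique even though the task attempt counter resets when the stage is re-executed.

```java
public class AttemptId {
    /**
     * Hypothetical helper: combines a stage attempt number and a task
     * attempt number into one long that is unique across stage
     * re-executions (task attempt numbers alone can repeat when a
     * stage is retried).
     */
    static long uniqueAttemptId(int stageAttemptNumber, int taskAttemptNumber) {
        // High 32 bits: stage attempt; low 32 bits: task attempt.
        return ((long) stageAttemptNumber << 32) | (taskAttemptNumber & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        // Same task attempt number, but different stage attempts,
        // yields distinct identifiers.
        System.out.println(uniqueAttemptId(0, 3)); // 3
        System.out.println(uniqueAttemptId(1, 3)); // 4294967299
    }
}
```

In a real Spark task you would obtain both numbers from `TaskContext.get()`; the packing scheme itself is just one way to make the pair a single comparable key.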