Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19848
> Check if the same jobId already is committed and then remove existing
> files and commit again.
If your job doesn't allow overwrite, that's mostly implicit; it only matters for
concurrent or sequential RDD writes where the SaveMode policy != ErrorIfExists,
which is the default, AFAIK.
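
For illustration only (my sketch, not code from this PR): the default write mode is `ErrorIfExists`, so repeated writes to the same destination fail unless a non-default mode is set explicitly. The path and data below are made up.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object SaveModeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("save-mode-sketch").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

    // Default mode is SaveMode.ErrorIfExists: this throws if /tmp/dest already has data.
    df.write.parquet("/tmp/dest")

    // Only with an explicit non-default mode does a repeated or concurrent write
    // overwrite (or append to) the same destination.
    df.write.mode(SaveMode.Overwrite).parquet("/tmp/dest")

    spark.stop()
  }
}
```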
If someone explicitly sets up two independent Spark contexts writing RDDs
to the same dest with some policy other than that, well: it's pretty ambiguous
what's going to happen anyway, so saying "don't do that, then" is probably
defensible.
What if you have >1 RDD write from a failed and restarted Spark context?
That's my sole concern.
But here, at least on YARN: failure of the AM will trigger release of all
the workers' containers around the cluster, except in the failure mode where a
node manager is isolated along with a worker and HA YARN is turned on, so NMs
don't panic if the RM is unreachable for a while.
This is all pretty convoluted. I think the only thing I care about is: "can
things be confident that if there are partitioned workers from a previous job,
they won't interfere with the work being done by a successor, or, if they do
interfere, it manifests as a failure rather than some silent corruption of the
output?"