Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/21558
For a write job, the writing only happens at the last stage, so the stage id
doesn't matter, and the data source v2 API assumes `(job id, partition id, task
attempt id)` uniquely identifies a write task, even across failures and retries.
The problem here is that when Spark retries a stage, it doesn't kill the tasks
of the old stage attempt; it just launches tasks for the new stage attempt. As a
result, we may end up with two running write tasks that have the same `(job id,
partition id, task attempt id)`.
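For illustration only, a minimal sketch of the collision (the `WriteTaskIdentity` case class is hypothetical, not part of the v2 API): a zombie task left over from stage attempt 0 and the fresh task launched for the same partition by stage attempt 1 are indistinguishable under the current triple.

```scala
// Hypothetical identity triple currently assumed to be unique by the v2 write path.
case class WriteTaskIdentity(jobId: String, partitionId: Int, taskAttemptNumber: Int)

object IdentityCollision {
  def main(args: Array[String]): Unit = {
    // Zombie write task still running from stage attempt 0.
    val zombie  = WriteTaskIdentity(jobId = "job-0", partitionId = 3, taskAttemptNumber = 0)
    // First task launched for the same partition by the retried stage (attempt 1).
    val retried = WriteTaskIdentity(jobId = "job-0", partitionId = 3, taskAttemptNumber = 0)
    // The identities compare equal, so the data source cannot tell the two writes apart.
    assert(zombie == retried)
  }
}
```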
One solution is to use `(job id, task id)` to uniquely identify a write task,
but then we lose information such as the index (partition id) of the task and
how many times it has been retried.
Or we can use `(job id, partition id, stage attempt id, task attempt id)`,
which is a little verbose, but carries all the information.
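A minimal sketch of what the verbose option could look like if the identity were assembled from the running task's `TaskContext` (the `RichWriteTaskIdentity` name and the exact composition are illustrative, not a proposed API):

```scala
import org.apache.spark.TaskContext

// Illustrative identity carrying the full set of fields from the second option.
case class RichWriteTaskIdentity(
    jobId: String,
    partitionId: Int,
    stageAttemptNumber: Int,
    taskAttemptNumber: Int)

object RichWriteTaskIdentity {
  // Must be called inside a running task, where TaskContext.get() is non-null.
  def current(jobId: String): RichWriteTaskIdentity = {
    val ctx = TaskContext.get()
    RichWriteTaskIdentity(
      jobId = jobId,
      partitionId = ctx.partitionId(),
      stageAttemptNumber = ctx.stageAttemptNumber(),
      taskAttemptNumber = ctx.attemptNumber())
  }
}
```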