Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    For a write job, the writing only happens in the last stage, so the stage id 
doesn't matter, and the data source v2 API assumes that `(job id, partition id, 
task attempt id)` uniquely identifies a write task, even across failures.
    
    The problem here is that when we retry a stage, Spark doesn't kill the tasks 
of the old stage; it just launches tasks for the new stage. So we may have running 
write tasks that share the same `(job id, partition id, task attempt id)`.
    
    One solution is to just use `(job id, task id)` to uniquely identify a write 
task, but then we lose information such as the partition index of the task and 
how many times it has been retried.
    
    Or we can use `(job id, partition id, stage attempt id, task attempt id)`, 
which is a little verbose but carries all the information.
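    
    Both proposals can be expressed with fields already available on 
`TaskContext` (again, the helper names and the `jobId` parameter below are 
illustrative, not a proposed API):
    
    ```scala
    import org.apache.spark.TaskContext
    
    // Option 1: (job id, task id). taskAttemptId() is unique across all task
    // attempts within a SparkContext, so it cannot collide even across stage
    // retries, but it drops the partition index and the retry count.
    def writeTaskIdentityOption1(jobId: String): (String, Long) = {
      val ctx = TaskContext.get()
      (jobId, ctx.taskAttemptId())
    }
    
    // Option 2: (job id, partition id, stage attempt id, task attempt id).
    // More verbose, but keeps the partition index and the retry count, while
    // the stage attempt number disambiguates old and retried stages.
    def writeTaskIdentityOption2(jobId: String): (String, Int, Int, Int) = {
      val ctx = TaskContext.get()
      (jobId, ctx.partitionId(), ctx.stageAttemptNumber(), ctx.attemptNumber())
    }
    ```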

