Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/19269
  
    > There is no restriction to let the output of data writers be visible to other writers, so it's possible to launch a write task just for cleaning up the data of other writers.
    
    Agreed. Other writers can possibly see and change the data.
    
    I'm not sure whether you intend this to cover some aspect of abort, but if 
you do, it wouldn't necessarily help with cleanup. In my use case, a writer may 
produce a file in any partition in S3, depending on the data it writes. To clean 
up after another writer, I'd have to go through all of the input data to find the 
partitions, then list each of those partition locations for uncommitted files. 
That isn't practical, so I don't think it could substitute for passing commit 
messages to job abort.
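
    To make the trade-off concrete, here's a rough sketch (the names are 
hypothetical, not the interfaces proposed in this PR): if each task's commit 
message carries the exact paths it wrote, a job-level abort can delete just 
those files. Without the messages, abort has to re-derive every partition from 
the input data and list each S3 location, which is what makes it impractical.

    ```scala
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path

    // Hypothetical commit message: each write task reports the files it created.
    case class FilesCommitted(paths: Seq[String])

    // Hypothetical driver-side cleanup, assuming the tasks' commit messages are
    // passed to the job-level abort.
    class CleanupOnAbort(conf: Configuration) {
      def abortJob(messages: Seq[FilesCommitted]): Unit = {
        for (p <- messages.flatMap(_.paths)) {
          val path = new Path(p)
          val fs = path.getFileSystem(conf)
          fs.delete(path, false) // best-effort removal of an uncommitted file
        }
      }
    }
    ```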

