turboFei removed a comment on issue #24142: [SPARK-27194][core] Job failures 
when task attempts do not clean up spark-staging parquet files
URL: https://github.com/apache/spark/pull/24142#issuecomment-541376155
 
 
   Hi @vanzin, @ajithme, and @cloud-fan,
   We are facing this problem too.
   Sorry, I did not notice this PR earlier and opened a new one: 
https://github.com/apache/spark/pull/26086.
   How about using this method to name task files for dynamic partition 
overwrite only?
   
   Currently, for a dynamic partition overwrite operation, the file name of a 
task's output is determined by splitId(taskId) and jobId.
   So, if speculation is enabled, a task's output file would conflict with that 
of its speculative counterpart.
   We can make the task file name for dynamic partition overwrite unique, and 
the outputCommitCoordinator would decide which task attempt can commit.
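
   To illustrate the naming conflict described above, here is a minimal sketch. 
It is not Spark's actual code: the function names, the attempt_id parameter, and 
the exact file name layout are assumptions made for illustration only.

```python
# Hypothetical sketch of the naming problem; not Spark's implementation.

def task_filename(split_id: int, job_id: str, ext: str = ".parquet") -> str:
    """Name derived only from splitId and jobId, as the comment describes."""
    return f"part-{split_id:05d}-{job_id}{ext}"


def unique_task_filename(split_id: int, job_id: str, attempt_id: int,
                         ext: str = ".parquet") -> str:
    """Assumed fix: also include the task attempt so each attempt gets its own file."""
    return f"part-{split_id:05d}-{attempt_id}-{job_id}{ext}"


# The original attempt and its speculative copy share splitId and jobId,
# so they target the same file name.
original = task_filename(3, "job-42")
speculative = task_filename(3, "job-42")
collides = original == speculative

# With the attempt id in the name, the two attempts no longer collide.
distinct = unique_task_filename(3, "job-42", 0) != unique_task_filename(3, "job-42", 1)
```

   Any value that differs between attempts of the same task would serve the same 
purpose; the attempt number is used here only because it is easy to follow.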
   
   Also, dynamic partition overwrite keeps a filesToMove set, so the unique 
file names would not cause duplicate results.
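
   The reasoning above can be sketched as follows, assuming that only attempts 
authorized by the commit coordinator contribute files to the set of files to 
move. The CommitCoordinator class and files_to_move function are hypothetical 
stand-ins, not Spark's outputCommitCoordinator or its filesToMove field.

```python
# Hypothetical sketch: one attempt per task wins the commit, and only the
# winner's file is recorded for the final move. Not Spark's implementation.

class CommitCoordinator:
    """Authorizes the first attempt that asks to commit for each split."""

    def __init__(self):
        self._winner = {}  # split_id -> attempt_id allowed to commit

    def can_commit(self, split_id: int, attempt_id: int) -> bool:
        # setdefault records the first asker and returns the recorded winner.
        return self._winner.setdefault(split_id, attempt_id) == attempt_id


def files_to_move(attempts, coordinator):
    """Collect staged files only from attempts that are allowed to commit."""
    moved = set()
    for split_id, attempt_id, path in attempts:
        if coordinator.can_commit(split_id, attempt_id):
            moved.add(path)
    return moved


# Split 0 ran twice (original plus speculative attempt); split 1 ran once.
attempts = [
    (0, 0, "part-00000-0-job.parquet"),
    (0, 1, "part-00000-1-job.parquet"),
    (1, 0, "part-00001-0-job.parquet"),
]
result = files_to_move(attempts, CommitCoordinator())
```

   In this sketch the speculative attempt's file is never added to the set, so 
each split contributes exactly one output file.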

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
