steveloughran commented on pull request #29895:
URL: https://github.com/apache/spark/pull/29895#issuecomment-700774332


   FWIW I'm going to change the default to v1, and log at WARN during job setup 
when you use v2 (unless you turn that specific log off). V2 is used in places 
where people have hit the scale limits with v1 and are willing to accept the 
risk of failures. Note that if every attempt of a task generates the same set 
of file names, the output is correct even without atomic task commit. The 
danger is when you get one or more of:
   
   * different task attempts generating files with different names
   * a requirement that all output files of a task come entirely and 
exclusively from a single task attempt.
   
   If your attempts are 100% deterministic, you are going to be safe.
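
   To make the failure mode concrete, here is a toy sketch in Python. It is not Hadoop or Spark code: the destination directory is modelled as a dict, and `commit_task`, `run_task` and the file names are all hypothetical. It only assumes what is described above, that a v2-style task commit moves files into the destination one at a time, so a failed attempt can leave some of its files behind.

```python
# Toy model of a non-atomic (v2-style) task commit. Hypothetical names throughout;
# the "destination directory" is just a dict of file name -> contents.

def commit_task(dest, attempt_files, fail_after=None):
    """Move files into dest one at a time; stop early to mimic a crash."""
    for i, (name, data) in enumerate(attempt_files.items()):
        if fail_after is not None and i >= fail_after:
            return False  # attempt died partway through its commit
        dest[name] = data  # same name overwrites, new name adds a file
    return True

def run_task(make_attempt):
    """Attempt 0 crashes after committing one file; attempt 1 then succeeds."""
    dest = {}
    commit_task(dest, make_attempt(0), fail_after=1)
    commit_task(dest, make_attempt(1))
    return dest

def deterministic(attempt_id):
    # Same file names and contents regardless of which attempt runs.
    return {"part-0000-a": "rows 0-9", "part-0000-b": "rows 10-19"}

def attempt_specific(attempt_id):
    # File names embed the attempt id, so attempts never overwrite each other.
    return {
        f"part-0000-a-{attempt_id}": "rows 0-9",
        f"part-0000-b-{attempt_id}": "rows 10-19",
    }

safe = run_task(deterministic)
unsafe = run_task(attempt_specific)

print(sorted(safe))    # two files, each written by an attempt with identical output
print(sorted(unsafe))  # three files: a leftover from attempt 0 plus both from attempt 1
```

   In the deterministic case the retry overwrites the leftover file, so the final listing is what a single clean attempt would have produced. In the attempt-specific case the leftover `part-0000-a-0` stays next to `part-0000-a-1`, so "rows 0-9" appears twice in the output.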




