steveloughran commented on pull request #2349:
URL: https://github.com/apache/hadoop/pull/2349#issuecomment-702137474


   @jbrennan333 what do you think we should say instead of "deprecated"? "Not 
recommended"?
   
   I was thinking of adding a link to the JIRA and changing the issue text 
there to clarify that the algorithm is:
   * safe if the names and content of generated output files are consistent 
across all task attempts (TAs)
   * unsafe if different TAs generate bad files (biggest risk, as partial 
failure of the 1st attempt may leave files behind)
   * unsafe if different TAs generate different content in the same files 
(only an issue on a network partition where TA #1 generates output as/after 
TA #2 does its work)
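
   The three cases above can be sketched with a toy model. This is only an illustration, not the committer's actual code: it assumes each task attempt commit merges its files straight into the destination directory, and it uses a dict of filename to content as a stand-in for that directory. The function names and file names are hypothetical.

```python
# Toy model: every task attempt (TA) commit merges its files directly into
# the job's destination directory. A dict of filename -> content stands in
# for the directory. All names here are hypothetical.

def commit_task_attempt(dest, attempt_files):
    """Merge one attempt's output into dest, overwriting same-named files."""
    dest.update(attempt_files)


def replay(commits):
    """Apply a sequence of task attempt commits, in order, to an empty dir."""
    dest = {}
    for attempt_files in commits:
        commit_task_attempt(dest, attempt_files)
    return dest


# Case 1: same names, same content. The retry overwrites with identical data.
consistent = replay([{"part-0000": "a,b"}, {"part-0000": "a,b"}])

# Case 2: attempts use different names. A file written by the partially
# failed first attempt survives next to the retry's output.
stray = replay([{"part-0000-ta1": "a"}, {"part-0000-ta2": "a,b"}])

# Case 3: same name, different content. The partitioned TA #1 writes after
# TA #2 has done its work, so the last writer wins.
late_writer = replay([{"part-0000": "from-ta2"}, {"part-0000": "from-ta1"}])

print(consistent)
print(stray)
print(late_writer)
```

   In this model only the first case leaves the destination in the state a single successful attempt would have produced.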
   
   Job cleanup deletes the whole job attempt dir, so that bounds how long a 
partitioned TA can still commit work. There's no risk of a VM pausing for 
3 hours, resuming, and an in-progress TA then completing its work and 
overwriting the final output. This is good.
   
   
   
   

