[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771233#comment-13771233
 ] 

Zhijie Shen commented on MAPREDUCE-5505:
----------------------------------------

bq. I dont think making jobimpl aware of all this is a good idea.
The reason to make JobImpl aware of unregister is that JobImpl#getState() 
needs that information to decide whether to return the final state (SUCCEEDED, 
FAILED, KILLED, ERROR) or the prior state (RUNNING). JobImpl#getState() is 
called not only by getReport() but also by JobInfo. As I mentioned above, I did 
this in getState() to ensure that the client protocol and the web UI see a 
consistent state.
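
To illustrate the idea only, here is a minimal sketch of a getState() that 
hides the final state until unregister has completed. It is not the code in 
the patch: the class JobStateSketch, the unregistered flag, and the finish() 
and markUnregistered() methods are hypothetical names chosen for this example.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not the actual JobImpl. Only the state names
// (RUNNING, SUCCEEDED, FAILED, KILLED, ERROR) come from the discussion.
class JobStateSketch {
    enum State { RUNNING, SUCCEEDED, FAILED, KILLED, ERROR }

    // State as tracked internally by the job.
    private volatile State internalState = State.RUNNING;
    // Assumed flag, set once the AM has unregistered from the RM.
    private final AtomicBoolean unregistered = new AtomicBoolean(false);

    void finish(State finalState) {
        internalState = finalState;
    }

    void markUnregistered() {
        unregistered.set(true);
    }

    // Single entry point used by every caller (report, web UI, ...),
    // so all of them observe the same externally visible state.
    State getState() {
        State current = internalState;
        boolean isFinal = current != State.RUNNING;
        if (isFinal && !unregistered.get()) {
            // Job is done internally, but unregister has not finished yet:
            // keep reporting the prior state.
            return State.RUNNING;
        }
        return current;
    }

    public static void main(String[] args) {
        JobStateSketch job = new JobStateSketch();
        job.finish(State.SUCCEEDED);
        System.out.println(job.getState()); // still the prior state
        job.markUnregistered();
        System.out.println(job.getState()); // now the final state
    }
}
```

Because the check lives in one method, any caller that goes through 
getState() gets the same answer, which is the consistency argument above.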

Therefore, in this case, we can move the getReport() logic from JobImpl to 
MRClientService, but it is not necessary.


bq. MRClientService is already the last thing to stop. So its a good place to 
contain this race condition handling logic.

+1. Agreed, it's a good place. When MRAppMaster#serviceStop is called, 
RMCommunicator#serviceStop runs before MRClientService#serviceStop, so 
unregister has already been executed by the time MRClientService stops. From 
that point on, it is safe to notify the client/web UI of the final state of 
the job, because one AM has exactly one job. 
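
The ordering argument can be sketched as follows. This is a toy model under 
stated assumptions, not Hadoop code: the Service interface, StopOrderSketch, 
and the log strings are hypothetical, and the only thing taken from the 
discussion is that the RM communicator stops (and unregisters) before the 
client service stops.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of the stop sequence; it does not use Hadoop classes.
class StopOrderSketch {
    interface Service {
        void stop();
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        boolean[] unregistered = {false};

        // Stand-in for RMCommunicator: unregisters while stopping.
        Service rmCommunicator = () -> {
            unregistered[0] = true;
            log.add("rm: unregistered");
        };

        // Stand-in for MRClientService: what it may report when it stops.
        Service clientService = () -> log.add(
                unregistered[0] ? "client: report final state"
                                : "client: report RUNNING");

        // Stop order as described above: RM communicator first, client last.
        for (Service s : Arrays.asList(rmCommunicator, clientService)) {
            s.stop();
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

With this order, the client-facing service never has to report a final state 
before unregister has run.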
                
> Clients should be notified job finished only after job successfully 
> unregistered 
> ---------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5505
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jian He
>            Assignee: Zhijie Shen
>         Attachments: MAPREDUCE-5505.1.patch, MAPREDUCE-5505.1.patch
>
>
> This is to make sure user is notified job finished after job is really done. 
> This does increase client latency but can reduce some races during unregister 
> like YARN-540

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
