[ https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772301#comment-13772301 ]
Bikas Saha commented on MAPREDUCE-5505:
---------------------------------------

Are we sure that the previous state is always RUNNING before FAILED?
{code}
+      case FAILED:
+        if (isUnregistered) {
+          return JobState.FAILED;
+        } else {
+          return JobState.RUNNING;
{code}
Instead of isUnregistered, let us create an AtomicBoolean called safeToReportTerminationToUser. Instead of living in JobImpl, this boolean can be made visible via the AppContext object so that everyone has access to it.

When to set the boolean to true? We could do it in RMCommunicator after unregister succeeds (like in this patch). Or we can do it in MRClientService.serviceStop(). Since MRClientService is the last service to stop(), we can be sure that everything has finished nicely by then. MRClientService.serviceStop() can set the boolean. Then we can move the sleep(5sec) from MRAppMaster to MRClientService.serviceStop(), after setting the boolean. We should leave a comment explaining this in MRAppMaster.shutdown() before the call to clientService.stop() so that it's easy for someone else to track this logic.

Please do run single-node tests to verify the behavior for real, along with RM restart.

> Clients should be notified job finished only after job successfully unregistered
> ---------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5505
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jian He
>            Assignee: Zhijie Shen
>         Attachments: MAPREDUCE-5505.1.patch, MAPREDUCE-5505.1.patch
>
>
> This is to make sure user is notified job finished after job is really done. This does increase client latency but can reduce some races during unregister like YARN-540
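
For reference, the safeToReportTerminationToUser proposal in the comment above could be sketched roughly as follows. This is only an illustration under assumptions, not the patch or the real Hadoop code: TerminationGate, InternalState, ReportedState, markSafeToReportTermination() and externalState() are hypothetical names, and where the flag would actually live (AppContext) and who would set it (RMCommunicator or MRClientService.serviceStop()) is left open, as in the comment. Like the quoted snippet, it masks terminal states as RUNNING, so it does not answer whether RUNNING is always the correct previous state.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-ins; these are NOT the real Hadoop JobImpl/JobState/AppContext types.
enum InternalState { RUNNING, SUCCEEDED, FAILED, KILLED }
enum ReportedState { RUNNING, SUCCEEDED, FAILED, KILLED }

class TerminationGate {
    // Stays false until shutdown has progressed far enough that clients may
    // be told about a terminal state (name taken from the comment).
    private final AtomicBoolean safeToReportTerminationToUser = new AtomicBoolean(false);

    // Assumed to be called once, e.g. after unregister succeeds or from the
    // stop() of the last service to shut down.
    void markSafeToReportTermination() {
        safeToReportTerminationToUser.set(true);
    }

    // Maps the internal job state to the state reported to clients.
    ReportedState externalState(InternalState internal) {
        switch (internal) {
            case SUCCEEDED:
            case FAILED:
            case KILLED:
                // Terminal states are masked as RUNNING until the flag is set.
                if (!safeToReportTerminationToUser.get()) {
                    return ReportedState.RUNNING;
                }
                return ReportedState.valueOf(internal.name());
            default:
                // Only RUNNING reaches this branch in this sketch.
                return ReportedState.RUNNING;
        }
    }

    public static void main(String[] args) {
        TerminationGate gate = new TerminationGate();
        System.out.println(gate.externalState(InternalState.FAILED)); // RUNNING
        gate.markSafeToReportTermination();
        System.out.println(gate.externalState(InternalState.FAILED)); // FAILED
    }
}
```

Using an AtomicBoolean here means the thread that sets the flag during shutdown and the client-facing threads that read it need no extra locking for this single value.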