[ https://issues.apache.org/jira/browse/MAPREDUCE-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607962#comment-13607962 ]
Bikas Saha commented on MAPREDUCE-5086:
---------------------------------------
Let's add a log wherever we change the value of isLastAMRetry. Also, the "Notify
JHEH" log should move into notifyIsLastAMRetry(), since it describes the specific
actions happening inside that method.
It would also be useful to add a comment before the shutdownJob() methods
explaining that the job object is passed in so that it can be overridden
in tests.
{code}
- //We are finishing cleanly so this is the last retry
- isLastAMRetry = true;
+ //if isLastAMRetry comes as true, should never set it to false
+ if ( !isLastAMRetry){
+ if ( jobImpl.getInternalState() != JobStateInternal.REBOOT) {
+ //We are finishing cleanly so this is the last retry
+ isLastAMRetry = true;
+ }
+ }
+ // Notify the JHEH and RMCommunicator whether this is lastAMRetry
+ LOG.info("Notify JHEH and RMCommunicator isAMLastRetry: " + isLastAMRetry);
+ notifyIsLastAMRetry(isLastAMRetry);
// Stop all services
{code}
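To make the two logging suggestions above concrete, here is a rough sketch of the shape I have in mind. It is a standalone simplification, not the actual MRAppMaster code: the class name LastRetrySketch, the State enum (a stand-in for JobStateInternal), the logLines list (a stand-in for LOG.info) and the simplified shutdownJob(State) signature are all made up for illustration. Only isLastAMRetry and notifyIsLastAMRetry() come from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the app master shutdown path.
class LastRetrySketch {
    // Stand-in for JobStateInternal; only the states needed for the example.
    enum State { SUCCEEDED, REBOOT }

    // Stand-in for LOG.info so the sketch has no Hadoop dependencies.
    final List<String> logLines = new ArrayList<>();

    private volatile boolean isLastAMRetry;

    LastRetrySketch(boolean initialIsLastAMRetry) {
        this.isLastAMRetry = initialIsLastAMRetry;
    }

    boolean isLastAMRetry() {
        return isLastAMRetry;
    }

    // Simplified: the real method takes the job object, not just its state.
    void shutdownJob(State internalState) {
        // If isLastAMRetry is already true, never set it back to false.
        if (!isLastAMRetry && internalState != State.REBOOT) {
            // We are finishing cleanly, so this is the last retry.
            isLastAMRetry = true;
            // Log at the point where the value actually changes.
            logLines.add("Setting isLastAMRetry to true, job state: " + internalState);
        }
        notifyIsLastAMRetry(isLastAMRetry);
    }

    void notifyIsLastAMRetry(boolean lastRetry) {
        // The "Notify JHEH" log lives here, next to the action it describes.
        logLines.add("Notify JHEH and RMCommunicator isLastAMRetry: " + lastRetry);
        // The real code would forward the flag to the JHEH and RMCommunicator here.
    }

    public static void main(String[] args) {
        LastRetrySketch reboot = new LastRetrySketch(false);
        reboot.shutdownJob(State.REBOOT);
        reboot.logLines.forEach(System.out::println);

        LastRetrySketch clean = new LastRetrySketch(false);
        clean.shutdownJob(State.SUCCEEDED);
        clean.logLines.forEach(System.out::println);
    }
}
```

With this shape, a REBOOT leaves the flag untouched and produces only the notify log, while a clean finish logs the change first and then the notification.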
There are random spurious newlines/spaces in the patch that need to be removed.
{code}
protected UserGroupInformation currentUser; // Will be setup during init
-
+
private volatile boolean isLastAMRetry = false;
................
- INTERNAL_ERROR_TRANSITION)
+ INTERNAL_ERROR_TRANSITION)
{code}
You mean "job" and not "jb", right?
{code}
+ private static class InternalTerminationTransition implements SingleArcTransition<JobImpl, JobEvent> {
+ JobStateInternal terminationState = null;
+ String jbHistoryString = null;
{code}
Is this required in all the tests?
{code}
+ jobid.setAppId(appId);
+ ContainerAllocator mockAlloc = mock(ContainerAllocator.class);
{code}
> MR app master deletes staging dir when sent a reboot command from the RM
> ------------------------------------------------------------------------
>
> Key: MAPREDUCE-5086
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5086
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: jian he
> Assignee: jian he
> Attachments: YARN-472.1.patch, YARN-472.2.patch
>
>
> If the RM is restarted when the MR job is running, then it sends a reboot
> command to the job. The job ends up deleting the staging dir and that causes
> the next attempt to fail.