[jira] [Commented] (MAPREDUCE-5086) MR app master deletes staging dir when sent a reboot command from the RM

Bikas Saha (JIRA) Wed, 20 Mar 2013 11:29:16 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607962#comment-13607962
 ]


Bikas Saha commented on MAPREDUCE-5086:
---------------------------------------

Let's add a log when we change the value of isLastAMRetry. Also, the "Notify 
JHEH" log should go into notifyIsLastAMRetry(), since it describes the specific 
actions happening inside that method.
It would also be useful to add a comment before the shutdownJob() methods 
explaining that we pass in the job object so that it can be overridden 
for tests.
{code}
-      //We are finishing cleanly so this is the last retry
-      isLastAMRetry = true;
+      //if isLastAMRetry comes as true, should never set it to false
+      if ( !isLastAMRetry){
+        if ( jobImpl.getInternalState() != JobStateInternal.REBOOT) {
+          //We are finishing cleanly so this is the last retry
+          isLastAMRetry = true;
+        }
+      }
+     // Notify the JHEH and RMCommunicator whether this is lastAMRetry
+      LOG.info("Notify JHEH and RMCommunicator isAMLastRetry: " + isLastAMRetry);
+      notifyIsLastAMRetry(isLastAMRetry);
       // Stop all services
{code}
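
For illustration, here is a minimal, self-contained sketch of the two logging 
suggestions above. It is not the actual MRAppMaster code: the class name 
LastRetrySketch, the trimmed-down JobStateInternal enum, and the notifiedValue 
field are hypothetical stand-ins, and shutdownJob() takes the job state 
directly instead of the job object to keep the example short.

```java
import java.util.logging.Logger;

public class LastRetrySketch {
    // Hypothetical stand-in for the real JobStateInternal enum.
    enum JobStateInternal { SUCCEEDED, REBOOT }

    private static final Logger LOG =
        Logger.getLogger(LastRetrySketch.class.getName());

    private volatile boolean isLastAMRetry;
    // Hypothetical field: records the value handed to the notify step.
    boolean notifiedValue;

    LastRetrySketch(boolean isLastAMRetry) {
        this.isLastAMRetry = isLastAMRetry;
    }

    /**
     * The state is passed in as a parameter (the patch passes the job object)
     * so that tests can supply whatever value they want to exercise.
     */
    boolean shutdownJob(JobStateInternal state) {
        // If isLastAMRetry comes in as true, never set it back to false.
        if (!isLastAMRetry && state != JobStateInternal.REBOOT) {
            // Finishing cleanly, so this is the last retry; log the change.
            LOG.info("Job ended in state " + state
                + "; changing isLastAMRetry from false to true");
            isLastAMRetry = true;
        }
        notifyIsLastAMRetry(isLastAMRetry);
        return isLastAMRetry;
    }

    void notifyIsLastAMRetry(boolean isLastRetry) {
        // The "Notify" log lives here because it describes this method's work.
        LOG.info("Notify JHEH and RMCommunicator isAMLastRetry: " + isLastRetry);
        notifiedValue = isLastRetry;
    }

    public static void main(String[] args) {
        System.out.println(
            new LastRetrySketch(false).shutdownJob(JobStateInternal.SUCCEEDED));
        System.out.println(
            new LastRetrySketch(false).shutdownJob(JobStateInternal.REBOOT));
        System.out.println(
            new LastRetrySketch(true).shutdownJob(JobStateInternal.REBOOT));
    }
}
```

With the change logged at the point where the flag flips, and the notify 
message logged inside the method that does the notifying, the AM log shows 
both when the value changed and what was actually passed on.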

There are random spurious newlines/spaces in the patch that need to be removed.
{code}
   protected UserGroupInformation currentUser; // Will be setup during init
-
+  
   private volatile boolean isLastAMRetry = false;
................
-              INTERNAL_ERROR_TRANSITION)
+              INTERNAL_ERROR_TRANSITION) 
{code}

You mean "job" and not "jb", right?
{code}
+  private static class InternalTerminationTransition implements
       SingleArcTransition<JobImpl, JobEvent> {
+    JobStateInternal terminationState = null;
+    String jbHistoryString = null;
{code}

Is this required in all the tests? 
{code}
+     jobid.setAppId(appId);
+     ContainerAllocator mockAlloc = mock(ContainerAllocator.class);
{code}
                
> MR app master deletes staging dir when sent a reboot command from the RM
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5086
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5086
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: jian he
>            Assignee: jian he
>         Attachments: YARN-472.1.patch, YARN-472.2.patch
>
>
> If the RM is restarted when the MR job is running, then it sends a reboot 
> command to the job. The job ends up deleting the staging dir and that causes 
> the next attempt to fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
