[ https://issues.apache.org/jira/browse/MAPREDUCE-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057067#comment-14057067 ]
Zhijie Shen commented on MAPREDUCE-5956:
----------------------------------------

Talked to Wangda offline. Basically, the current solution is not to compute isLastRetry at the beginning of the MR AM lifecycle, and to keep it false (except in some corner cases). This way, in any of the scenarios a, b, or c, the MR AM will get a retry upon failure, and the RM will decide whether the MR job still has a chance, with preemption considered. Therefore, the MR AM will never lose a retry chance it should have. The trade-off is that the staging dir will not be cleaned up at the real last retry, which is going to be taken care of by YARN-2261.

It's now clear to me, thanks Wangda! And +1 for the plan.

> MapReduce AM should not use maxAttempts to determine if this is the last retry
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5956
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster, mrv2
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Wangda Tan
>            Priority: Blocker
>
> Found this while reviewing YARN-2074. The problem is that after YARN-2074, we
> don't count AM preemption towards AM failures on RM side, but MapReduce AM
> itself checks the attempt id against the max-attempt count to determine if
> this is the last attempt.
> {code}
> public void computeIsLastAMRetry() {
>   isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
> }
> {code}
> This causes issues w.r.t. deletion of staging directory etc.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
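
For illustration only, the difference between the check quoted in the issue and the plan described in the comment can be sketched roughly as below. This is not the actual MR AM code: the class LastRetrySketch, both method names, and the example values are hypothetical, and the "corner cases" mentioned in the comment are not modelled.

```java
// Hypothetical sketch; names and values are illustrative, not Hadoop's real code.
public class LastRetrySketch {

    /** Check quoted in the issue: compare the attempt id with the max-attempt count. */
    static boolean oldIsLastAMRetry(int attemptId, int maxAppAttempts) {
        return attemptId >= maxAppAttempts;
    }

    /**
     * Plan from the comment: do not compute the flag at AM startup and keep it
     * false, leaving it to the RM to decide whether another attempt is allowed.
     * (The corner cases mentioned in the comment are left out of this sketch.)
     */
    static boolean plannedIsLastAMRetry(int attemptId, int maxAppAttempts) {
        return false;
    }

    public static void main(String[] args) {
        // Assumed example: max attempts is 2 and attempt 1 was preempted.
        // After YARN-2074 the RM does not count that preemption as a failure,
        // so attempt 2 is not really the last one.
        int maxAppAttempts = 2;
        int attemptId = 2;

        System.out.println("old check: " + oldIsLastAMRetry(attemptId, maxAppAttempts));
        System.out.println("planned:   " + plannedIsLastAMRetry(attemptId, maxAppAttempts));
    }
}
```

With these assumed values the old check reports a last retry while the planned behaviour does not, which matches the trade-off in the comment: the AM keeps its retry chance, and staging dir cleanup at the real last retry is left to YARN-2261.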