[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058110#comment-14058110
 ] 

Wangda Tan commented on MAPREDUCE-5956:
---------------------------------------

Hi [~mayank_bansal] and [~hitesh],
Currently, when killing a container, the YARN NM first sends SIGTERM, sleeps 
for a while (250ms by default, set by 
yarn.nodemanager.sleep-delay-before-sigkill.ms), and then sends SIGKILL if the 
process is still alive.
The MR shutdown hook can catch SIGTERM. So in Hitesh's scenario, if the AM hits 
OOM on its last retry and is killed by the NM ContainersMonitor, the AM will 
not do cleanup. If the AM is not on its last attempt, it will be restarted by 
the RM.
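
To illustrate the SIGTERM, sleep, SIGKILL sequence described above, here is a 
minimal sketch using the plain Java Process API. It is not the NM's actual 
code. The class GracefulKill and the helpers terminate and startSleeper are 
hypothetical names, and it assumes a Unix-like host where Process.destroy() 
delivers SIGTERM, destroyForcibly() delivers SIGKILL, and a "sleep" command 
is available.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from the NodeManager sources.
class GracefulKill {

    /**
     * Asks the process to exit, waits up to graceMs for it to do so, and
     * forces termination if it is still alive afterwards.
     * Returns true if the forced kill was needed.
     */
    static boolean terminate(Process process, long graceMs) {
        // On typical Unix-like JVMs this sends SIGTERM, so the target's
        // shutdown hooks get a chance to run.
        process.destroy();
        try {
            if (process.waitFor(graceMs, TimeUnit.MILLISECONDS)) {
                return false; // exited within the grace period
            }
            // Still alive: on typical Unix-like JVMs this sends SIGKILL,
            // which cannot be caught, so no shutdown hook runs.
            process.destroyForcibly();
            process.waitFor();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            process.destroyForcibly();
            return true;
        }
    }

    /** Starts a child that just sleeps (assumes "sleep" is on the PATH). */
    static Process startSleeper(int seconds) {
        try {
            return new ProcessBuilder("sleep", Integer.toString(seconds)).start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Process child = startSleeper(30);
        // 250 mirrors the default grace period mentioned above.
        boolean forced = terminate(child, 250);
        System.out.println("forced kill needed: " + forced
                + ", alive: " + child.isAlive());
    }
}
```

The point of the grace period is that only the first signal gives the target 
a chance to run its shutdown hook. A process that does not exit within the 
delay is killed without any cleanup.
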
Thanks,
Wangda


> MapReduce AM should not use maxAttempts to determine if this is the last retry
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5956
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster, mrv2
>    Affects Versions: 2.4.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Wangda Tan
>            Priority: Blocker
>         Attachments: MR-5956.patch
>
>
> Found this while reviewing YARN-2074. The problem is that after YARN-2074, we 
> don't count AM preemption towards AM failures on RM side, but MapReduce AM 
> itself checks the attempt id against the max-attempt count to determine if 
> this is the last attempt.
> {code}
>     public void computeIsLastAMRetry() {
>       isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
>     }
> {code}
> This causes issues w.r.t. deletion of the staging directory, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
