[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975608#comment-13975608
 ] 

Jason Lowe commented on MAPREDUCE-5848:
---------------------------------------

bq. On the positive side, the AM should know the container was on the 
short-list to be killed from previous preemption messages it received,
so maybe it could count a failure of a container "doomed" by preemption as a 
kill? Or simply postpone the decision on FAIL/KILL. Not sure...

Yes, the AM should definitely know, and I think the change in the patch is good, 
just not sufficient.

As for postponing the decision, we may have to do just that.  To handle the 
general case, where SIGTERM causes failures in the task that should be ignored 
in light of the kill, the AM may need to wait until it receives the container 
status from the RM to distinguish the cases.  I haven't thought through all of 
the ramifications of doing that, and I suspect there could be long delays in 
some corner cases (e.g., the node fails as the task fails, and the RM takes a 
while to expire the node before it can send the container status).
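
To make the postponed decision concrete, here is a rough sketch of the idea. 
It is not code from the patch: DeferredAttemptClassifier, its methods, and the 
boolean "preempted" flag are all hypothetical stand-ins for whatever the AM 
would derive from preemption messages and from the container status the RM 
reports.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not the MapReduce AM's actual classes.
class DeferredAttemptClassifier {
    enum Outcome { PENDING, FAILED, KILLED }

    // Containers the RM has already named in preemption messages.
    private final Set<String> doomed = new HashSet<>();
    // Containers whose task failed but whose FAIL/KILL decision is postponed.
    private final Set<String> awaitingStatus = new HashSet<>();
    // Failures that count against the maximum number of attempts.
    private int failureCount = 0;

    void onPreemptionMessage(String containerId) {
        doomed.add(containerId);
    }

    // Called when a task reports a failure (possibly a side effect of SIGTERM).
    Outcome onTaskFailure(String containerId) {
        if (doomed.contains(containerId)) {
            return Outcome.KILLED; // already known to be on the short-list
        }
        awaitingStatus.add(containerId); // postpone until the RM reports status
        return Outcome.PENDING;
    }

    // Called when the RM's container status arrives; "preempted" is assumed
    // to be derived from that status.
    Outcome onContainerStatus(String containerId, boolean preempted) {
        if (!awaitingStatus.remove(containerId)) {
            return Outcome.PENDING; // no decision was deferred for this container
        }
        if (preempted) {
            return Outcome.KILLED; // does not count against the failure limit
        }
        failureCount++;
        return Outcome.FAILED;
    }

    int getFailureCount() {
        return failureCount;
    }
}
```

Note that the sketch does nothing about the delay concern: a deferred attempt 
stays PENDING for as long as the RM takes to report the container status.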

> MapReduce counts forcibly preempted containers as FAILED
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-5848
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5848
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.1.0-beta
>            Reporter: Carlo Curino
>            Assignee: Subramaniam Krishnan
>         Attachments: YARN-1958.patch
>
>
> The MapReduce AM considers a forcibly preempted container FAILED, 
> while I think it should be considered KILLED (i.e., it should not count 
> against the maximum number of failures). 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
