[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987061#comment-13987061
 ] 

Jason Lowe edited comment on MAPREDUCE-5848 at 5/1/14 10:11 PM:
----------------------------------------------------------------

Sure, I think that's a reasonable approach.  It's definitely an incremental 
improvement from what we have today.

The kill event should carry a useful diagnostic message so users can tell that 
the task was preempted. That is, instead of a raw TA_KILL event it should send 
a TaskAttemptKillEvent with a diagnostic.  Otherwise I think users will be 
confused as to why the task was killed (e.g., was it preempted or aborted?) 
and may have to dig through a rather large AM log to sort that out.
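
To illustrate the difference between a raw kill event and one that carries a 
diagnostic, here is a minimal, self-contained sketch. The classes below are 
simplified stand-ins written for this example, not the real Hadoop classes, 
and the helper preemptionKill, the message wording, and the IDs are all 
hypothetical.

```java
// Simplified stand-ins for illustration only; NOT the real Hadoop classes.
enum TaskAttemptEventType { TA_KILL }

// A bare event: it says which attempt is affected and what kind of event it is.
class TaskAttemptEvent {
    private final String attemptId;
    private final TaskAttemptEventType type;

    TaskAttemptEvent(String attemptId, TaskAttemptEventType type) {
        this.attemptId = attemptId;
        this.type = type;
    }

    String getAttemptId() { return attemptId; }
    TaskAttemptEventType getType() { return type; }
}

// A kill event that additionally carries a human-readable diagnostic.
class TaskAttemptKillEvent extends TaskAttemptEvent {
    private final String message;

    TaskAttemptKillEvent(String attemptId, String message) {
        super(attemptId, TaskAttemptEventType.TA_KILL);
        this.message = message;
    }

    String getMessage() { return message; }
}

class PreemptionKillSketch {
    // Hypothetical helper: builds a kill event whose diagnostic names
    // preemption as the cause, so the reason is visible without reading logs.
    static TaskAttemptKillEvent preemptionKill(String attemptId, String containerId) {
        return new TaskAttemptKillEvent(
            attemptId, "Container " + containerId + " preempted by the scheduler");
    }

    public static void main(String[] args) {
        // Illustrative IDs only.
        TaskAttemptKillEvent event =
            preemptionKill("attempt_0001_m_000003_0", "container_0001_01_000004");
        System.out.println(event.getType() + ": " + event.getMessage());
    }
}
```

With the raw event, only the TA_KILL type reaches the attempt; with the 
diagnostic, the reason travels with the event and can be surfaced to the user.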

It'd also be nice to add a unit test in TestRMContainerAllocator.
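
As a rough idea of what such a test could check, the sketch below models the 
behavior the issue asks for: a preempted attempt ends up KILLED rather than 
FAILED, so it does not count against the failure limit. This is a simplified 
model with hypothetical names (AttemptOutcome, FailureCounterSketch), not 
code from TestRMContainerAllocator or the MapReduce AM.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model for illustration only; names are hypothetical.
enum AttemptOutcome { SUCCEEDED, FAILED, KILLED }

class FailureCounterSketch {
    // Only FAILED attempts count toward the limit; KILLED attempts
    // (for example, preempted ones) are ignored.
    static long countFailures(List<AttemptOutcome> outcomes) {
        return outcomes.stream().filter(o -> o == AttemptOutcome.FAILED).count();
    }

    // Assumption: the task is declared failed once the number of FAILED
    // attempts reaches maxAttempts.
    static boolean taskFailed(List<AttemptOutcome> outcomes, int maxAttempts) {
        return countFailures(outcomes) >= maxAttempts;
    }

    public static void main(String[] args) {
        // Two preemptions recorded as KILLED plus one real failure.
        List<AttemptOutcome> asKilled = Arrays.asList(
            AttemptOutcome.KILLED, AttemptOutcome.KILLED, AttemptOutcome.FAILED);
        // The same history with the preemptions misrecorded as FAILED.
        List<AttemptOutcome> asFailed = Arrays.asList(
            AttemptOutcome.FAILED, AttemptOutcome.FAILED, AttemptOutcome.FAILED);

        System.out.println("preemptions as KILLED, task failed: " + taskFailed(asKilled, 3));
        System.out.println("preemptions as FAILED, task failed: " + taskFailed(asFailed, 3));
    }
}
```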


> MapReduce counts forcibly preempted containers as FAILED
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-5848
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5848
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.1.0-beta
>            Reporter: Carlo Curino
>            Assignee: Subramaniam Krishnan
>         Attachments: YARN-1958.patch
>
>
> The MapReduce AM treats a forcibly preempted container as FAILED, while I 
> think it should be treated as KILLED (i.e., it should not count against the 
> maximum number of failures). 



