[ https://issues.apache.org/jira/browse/MAPREDUCE-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mayank Bansal updated MAPREDUCE-5900:
-------------------------------------
Description:
When YARN reports a completed container to the MR AM, the AM always interprets
it as a failure. This can lead to a job failing because too many of its tasks
failed, when in fact they failed only because the scheduler preempted them.
MR needs to recognize the special exit code value of -102 and interpret it as
the container being killed rather than as a container failure.
was:
When YARN reports a completed container to the MR AM, the AM always interprets
it as a failure. This can lead to a job failing because too many of its tasks
failed, when in fact they failed only because the scheduler preempted them.
MR needs to recognize the special exit code value of -100 and interpret it as
the container being killed rather than as a container failure.
> Container preemption interpreted as task failures and eventually job failures
> ------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5900
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5900
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster, mr-am, mrv2
> Affects Versions: 2.0.2-alpha
> Reporter: Mayank Bansal
> Assignee: Sandy Ryza
> Fix For: 2.1.0-beta
>
>
> When YARN reports a completed container to the MR AM, the AM always interprets
> it as a failure. This can lead to a job failing because too many of its tasks
> failed, when in fact they failed only because the scheduler preempted them.
> MR needs to recognize the special exit code value of -102 and interpret it as
> the container being killed rather than as a container failure.
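
For illustration only, the behaviour the description asks for could look
roughly like the sketch below. It is not the MR AM code and not the patch for
this issue: the class name, the Outcome enum and both methods are hypothetical,
and the only value taken from the description is the exit code -102. The sketch
classifies a completed container as KILLED when it was preempted and as FAILED
otherwise, so that only real failures count against a task's attempt limit.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; names are illustrative, not actual MR AM classes. */
public class PreemptionAwareClassifier {

    // Exit code that the issue description says marks a preempted container.
    static final int PREEMPTED_EXIT_STATUS = -102;

    /** How the AM could record a container that YARN reports as completed. */
    enum Outcome { KILLED, FAILED }

    /** Preempted containers are treated as killed; everything else as failed. */
    static Outcome classify(int exitStatus) {
        return exitStatus == PREEMPTED_EXIT_STATUS ? Outcome.KILLED : Outcome.FAILED;
    }

    /** Counts only the completions that should count against the attempt limit. */
    static int countFailures(List<Integer> exitStatuses) {
        int failures = 0;
        for (int status : exitStatuses) {
            if (classify(status) == Outcome.FAILED) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Two preempted containers and two that exited with non-zero codes.
        List<Integer> statuses = Arrays.asList(-102, -102, 1, 137);
        System.out.println("failures counted: " + countFailures(statuses)); // prints 2
    }
}
```

With the behaviour described in the issue, all four completions above would be
counted as failures; with the check in place only two are, so a heavily
preempted job is less likely to hit its failure limit.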