[ https://issues.apache.org/jira/browse/MAPREDUCE-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158865#comment-13158865 ]
Mahadev konar commented on MAPREDUCE-3460:
------------------------------------------
Great.
Thanks Hitesh.
Bobby, can you try it out and see if you can add a test case?
As for the long-term goal of cleaning up the if-then-else, we'll have to give
it some thought before we go there. Hopefully 0.23 will be stable soon.
> MR AM can hang if containers are allocated on a node blacklisted by the AM
> --------------------------------------------------------------------------
>
> Key: MAPREDUCE-3460
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3460
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am, mrv2
> Affects Versions: 0.23.0
> Reporter: Siddharth Seth
> Priority: Blocker
>
> When an AM is assigned a FAILED_MAP (priority = 5) container on a nodemanager
> that it has blacklisted, it tries to find a corresponding container request.
> The lookup uses the hostname to find the matching container request, so it can
> end up returning any of the ContainerRequests that requested a container on
> this node. That container request is cleaned to remove the bad node and then
> added back to the RM 'ask' list.
> The AM clears the 'ask' list after each heartbeat. The RM Allocator is still
> aware of the priority=5 container (in 'remoteRequestsTable'), but that request
> never gets added back to the 'ask' set, which is what is sent to the RM.
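
The sequence described above can be sketched as a toy model. This is one
reading of the description, not the actual Hadoop code: the class
AskListSketch, its methods, and the use of priority 20 for a regular map
request are all assumptions made for illustration. Only the names 'ask' and
'remoteRequestsTable' and the priority=5 value come from the report.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the AM-side bookkeeping.
// It is NOT the real MR AM allocator implementation.
class AskListSketch {
    // Requests the AM still considers outstanding: priority -> count.
    // Plays the role of 'remoteRequestsTable' from the report.
    final Map<Integer, Integer> remoteRequestsTable = new HashMap<>();

    // Priorities whose requests go out on the next heartbeat ('ask').
    final Set<Integer> ask = new HashSet<>();

    void addRequest(int priority) {
        remoteRequestsTable.merge(priority, 1, Integer::sum);
        ask.add(priority);
    }

    // The AM clears 'ask' after each heartbeat; the table is left untouched.
    void heartbeat() {
        ask.clear();
    }

    // A container on a blacklisted node is rejected and the request that
    // was matched by hostname is put back on 'ask'.
    void reAsk(int matchedPriority) {
        ask.add(matchedPriority);
    }

    // Returns the priorities that are still outstanding in the table
    // but are absent from 'ask', so the RM is never told about them again.
    static Set<Integer> pendingButNotAsked() {
        AskListSketch am = new AskListSketch();
        am.addRequest(5);   // FAILED_MAP request
        am.addRequest(20);  // assumed regular map request for the same host
        am.heartbeat();     // both requests sent; 'ask' is now empty

        // The RM hands back a priority=5 container on a blacklisted node.
        // The hostname-only lookup matches the priority=20 request instead.
        am.reAsk(20);

        Set<Integer> missing = new HashSet<>(am.remoteRequestsTable.keySet());
        missing.removeAll(am.ask);
        return missing;
    }

    public static void main(String[] args) {
        System.out.println("Pending but never re-sent: " + pendingButNotAsked());
    }
}
```

In this model the priority=5 entry stays in the table but is absent from
'ask', which matches the hang described: the AM keeps waiting for a container
that the RM is never asked for again. Matching on both hostname and priority
would be one possible way to avoid the mismatch, though the report does not
say which fix was chosen.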