[ https://issues.apache.org/jira/browse/MAPREDUCE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287699#comment-13287699 ]
Robert Joseph Evans commented on MAPREDUCE-4304:
------------------------------------------------
+1. This is not limited to tiny clusters; it also happens on tiny queues.
> Deadlock where all containers are held by ApplicationMasters should be
> prevented
> --------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4304
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4304
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2, resourcemanager
> Affects Versions: 0.23.1
> Reporter: Herman Chen
>
> In my test cluster with 4 NodeManagers, each with only ~1.6G of container
> memory, when a burst of jobs (e.g. more than 10) is submitted concurrently,
> it is likely that 4 jobs are accepted and 4 ApplicationMasters are allocated,
> but then the jobs block each other indefinitely because they are all waiting
> to allocate more containers.
> Note that the problem is not limited to tiny clusters like this one. As long
> as jobs are submitted faster than they finish, the cluster may enter a
> vicious cycle in which more and more containers are locked up by
> ApplicationMasters.
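
For illustration only, the deadlock described above can be reproduced with a toy model. This is a rough sketch, not YARN code: it assumes each node (or queue share) is a single "slot", that an ApplicationMaster holds one slot for the lifetime of its job, and that each task occupies one slot for one tick. The function `simulate` and the parameter `max_am_slots` are hypothetical names invented for this sketch; the cap stands in for any policy that limits how many slots ApplicationMasters may hold.

```python
from collections import deque


def simulate(total_slots, num_jobs, tasks_per_job, max_am_slots=None, max_ticks=1000):
    """Return how many jobs finish under a toy slot-based scheduler.

    Assumptions (not taken from YARN): one slot per container, an AM holds
    one slot until its job is done, and every task runs for exactly one tick.
    """
    pending = deque(range(num_jobs))  # jobs still waiting for an AM slot
    running = {}                      # job id -> number of tasks left
    finished = 0

    for _ in range(max_ticks):
        # Greedily admit AMs while a slot is free and the optional cap allows.
        while pending and len(running) < total_slots and (
            max_am_slots is None or len(running) < max_am_slots
        ):
            running[pending.popleft()] = tasks_per_job

        # Slots not held by AMs are what is left for task containers.
        free = total_slots - len(running)
        if free == 0 and running:
            # Every slot is held by an AM, so no task can ever run: deadlock.
            return finished

        # Hand the free slots to running jobs in admission order.
        for job in list(running):
            if free == 0:
                break
            ran = min(free, running[job])
            running[job] -= ran
            free -= ran

        # Jobs with no tasks left release their AM slot.
        for job in [j for j, left in running.items() if left == 0]:
            del running[job]
            finished += 1

        if not pending and not running:
            break

    return finished


if __name__ == "__main__":
    print("no cap, 10 jobs:", simulate(4, 10, 3))
    print("AM cap of 2, 10 jobs:", simulate(4, 10, 3, max_am_slots=2))
    print("no cap, 3 jobs:", simulate(4, 3, 3))
```

In this model, a burst of 10 jobs on 4 slots with no cap stalls before any job finishes, while the same burst completes once ApplicationMasters are limited to 2 of the 4 slots. The same arithmetic applies to a small queue on a large cluster if `total_slots` is read as the queue's share.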