[ https://issues.apache.org/jira/browse/MAPREDUCE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160066#comment-13160066 ]
Thomas Graves commented on MAPREDUCE-3483:
------------------------------------------

No, I don't think they are the same. In this case, it was just that one node that the reducer shouldn't have been scheduled/reserved on. It could have been scheduled on any of the other nodes in the cluster, as they (eventually) had enough memory. All the other nodes in the cluster may have been running maps from that particular job or tasks from other jobs when the reservation was made, but those eventually finished and the reducer asking for 8G would have been scheduled.

I think it's a special case: don't reserve a container on a node if (node memory - AM memory) < requested container memory, where the requested container is in the same job as the AM. The AM is never going to finish before the requested container, so the container will never get scheduled on that node.

> CapacityScheduler reserves container on same node as AM but can't ever use due to never enough avail memory
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3483
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3483
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Thomas Graves
>            Priority: Blocker
>
> Saw a case where a job was stuck trying to get reducers. The issue is the
> capacity scheduler reserved a container on the same node as the application
> master but there wasn't ever enough memory to run the reducer on that node.
> Node total memory was 8G, Reducer needed 8G, AM was using 2G. This
> particular job had 10 reducers and it was stuck waiting on the one because
> the AM + reserved reducer memory was already over the queue limit.
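
For illustration, the special case proposed in the comment above could be sketched roughly as follows. This is not CapacityScheduler code: the class name, the method name, and the choice of megabytes as the unit are all hypothetical, and the sketch assumes the scheduler already knows how much memory the job's own AM holds on the candidate node.

```java
/**
 * Hypothetical sketch, not actual CapacityScheduler code.
 * Encodes the rule: do not reserve a container on a node if
 * (node memory - AM memory) < requested memory, where the AM
 * belongs to the same job as the requested container.
 */
final class ReservationCheck {
    private ReservationCheck() {}

    /**
     * @param nodeTotalMb total memory of the candidate node, in MB
     * @param amOnNodeMb  memory held on this node by the AM of the same job
     *                    (0 if that AM runs on a different node)
     * @param requestedMb memory requested for the container, in MB
     * @return true if the request could fit once everything except the AM
     *         has finished on the node; false if a reservation here can
     *         never be satisfied while the AM is running
     */
    static boolean reservationCanEverBeSatisfied(long nodeTotalMb,
                                                 long amOnNodeMb,
                                                 long requestedMb) {
        // The AM outlives every other container of its job, so its memory
        // is never released before the requested container would need it.
        return nodeTotalMb - amOnNodeMb >= requestedMb;
    }
}
```

With the numbers from the report (8G node, 2G AM, 8G reducer), reservationCanEverBeSatisfied(8192, 2048, 8192) returns false, so under this rule the reservation on the AM's node would be skipped and the reducer would wait for one of the other nodes instead.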