[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032476#comment-14032476 ]
Niels Basjes commented on MAPREDUCE-5928:
-----------------------------------------

I'm not the only one who ran into this:
http://hortonworks.com/community/forums/topic/mapreduce-race-condition-big-job/

> Deadlock allocating containers for mappers and reducers
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-5928
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>         Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
>            Reporter: Niels Basjes
>         Attachments: Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg
>
>
> I have a small cluster consisting of 8 desktop-class systems (1 master + 7 workers).
> Because these systems have little memory, I configured YARN as follows:
> {quote}
> yarn.nodemanager.resource.memory-mb = 2200
> yarn.scheduler.minimum-allocation-mb = 250
> {quote}
> On my client I set:
> {quote}
> mapreduce.map.memory.mb = 512
> mapreduce.reduce.memory.mb = 512
> {quote}
> I then ran a job with 27 mappers and 32 reducers.
> After a while, this deadlock occurred:
> - All nodes had been filled to their maximum capacity with reducers.
> - 1 mapper was waiting for a container slot to start in.
> I tried killing reducer attempts, but that didn't help (new reducer attempts simply took over the existing containers).
> *Workaround*:
> I set this value from my job. The default value is 0.05 (= 5%).
> {quote}
> mapreduce.job.reduce.slowstart.completedmaps = 0.99f
> {quote}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
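For reference, the workaround quoted above can also be applied per job at submission time. A minimal sketch, assuming the job is launched with `hadoop jar` and its driver uses `ToolRunner`/`GenericOptionsParser` so that `-D` properties are picked up (the jar and class names here are hypothetical placeholders):

```shell
# Delay reducer start until 99% of map tasks have completed, so reducers
# cannot occupy every container while mappers are still waiting to run.
hadoop jar my-job.jar com.example.MyJob \
  -D mapreduce.job.reduce.slowstart.completedmaps=0.99 \
  /input /output
```

Setting the property per job avoids changing the cluster-wide default in mapred-site.xml; the trade-off is that reducers no longer overlap shuffle with the map phase, which can lengthen total job runtime.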