[jira] [Updated] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated MAPREDUCE-5928:
    Attachment: MR job stuck in deadlock.png.jpg

Deadlock allocating containers for mappers and reducers
-------------------------------------------------------
        Key: MAPREDUCE-5928
        URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
    Project: Hadoop Map/Reduce
 Issue Type: Bug
Environment: Hadoop 2.4.0 (as packaged by Hortonworks in HDP 2.1.2)
   Reporter: Niels Basjes
Attachments: MR job stuck in deadlock.png.jpg

I have a small cluster consisting of 8 desktop-class systems (1 master + 7 workers). Because these systems have little memory, I configured YARN as follows:
{quote}
yarn.nodemanager.resource.memory-mb = 2200
yarn.scheduler.minimum-allocation-mb = 250
{quote}
On my client I set:
{quote}
mapreduce.map.memory.mb = 512
mapreduce.reduce.memory.mb = 512
{quote}
Then I ran a job with 27 mappers and 32 reducers. After a while this deadlock occurred:
- All nodes had been filled to their maximum capacity with reducers.
- 1 mapper was waiting for a container slot to start in.

I tried killing reducer attempts, but that didn't help: new reducer attempts simply took over the existing containers.

*Workaround*: I set the following value from my job (the default is 0.05, i.e. 5%):
{quote}
mapreduce.job.reduce.slowstart.completedmaps = 0.99f
{quote}

--
This message was sent by Atlassian JIRA (v6.2#6252)
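As a back-of-the-envelope check of the scenario above (illustrative only — the scheduler may round each 512 MB request up to a multiple of yarn.scheduler.minimum-allocation-mb, which would reduce capacity further), the cluster simply has fewer 512 MB container slots than the job has reducers, so reducers started early under the default slowstart of 0.05 can occupy every slot:

```python
# Illustrative capacity arithmetic for the cluster described above.
# Assumes containers are exactly 512 MB; the real scheduler may round
# requests up in multiples of yarn.scheduler.minimum-allocation-mb (250).

NODE_MB = 2200      # yarn.nodemanager.resource.memory-mb
CONTAINER_MB = 512  # mapreduce.map.memory.mb / mapreduce.reduce.memory.mb

def cluster_slots(workers: int) -> int:
    """Number of 512 MB containers that fit across all worker nodes."""
    return workers * (NODE_MB // CONTAINER_MB)

reducers = 32
print(cluster_slots(7))  # 28 slots with all 7 workers online
print(cluster_slots(6))  # 24 slots once Node2 is taken offline

# 32 reducers outnumber the slots either way, so once reducers start
# early they can fill every container and starve the remaining mapper.
assert reducers > cluster_slots(7)
```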
[jira] [Updated] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated MAPREDUCE-5928:
    Attachment: Cluster fully loaded.png.jpg

NOTE: Node2 had issues, so the system took it offline (0 containers). Perhaps this is what confused the MapReduce application?
[jira] [Updated] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated MAPREDUCE-5928:
    Attachment: AM-MR-syslog - Cleaned.txt.gz

I downloaded the Application Master log and attached it to this issue. (I changed the domain name of the nodes.)
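For reference, the workaround above can also be passed per job on the command line, assuming the job's driver goes through ToolRunner/GenericOptionsParser so that -D properties are picked up (the jar name, driver class, and paths below are placeholders, not from this issue):

```shell
# Delay reducer startup until 99% of maps have completed, so reducers
# cannot occupy every container while a mapper is still waiting.
hadoop jar my-job.jar MyDriver \
  -D mapreduce.job.reduce.slowstart.completedmaps=0.99 \
  /input /output
```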