[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-04-01 Thread mai shurong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390466#comment-14390466 ] mai shurong commented on YARN-3416: --- I found a new case today. I submitted a larger

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-31 Thread Rohith (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1438#comment-1438 ] Rohith commented on YARN-3416: -- bq. there are only 4 NodeManagers in cluster, so it is possibl

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-30 Thread mai shurong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387838#comment-14387838 ] mai shurong commented on YARN-3416: --- mapreduce.job.reduce.slowstart.completedmaps is 0.5

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-30 Thread Ray Chiang (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386992#comment-14386992 ] Ray Chiang commented on YARN-3416: -- There probably is a bug here, but what value do you ha

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-30 Thread mai shurong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386300#comment-14386300 ] mai shurong commented on YARN-3416: --- The AM logs repeatedly print lines such as the following:

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-30 Thread mai shurong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386301#comment-14386301 ] mai shurong commented on YARN-3416: --- The AM logs repeatedly print lines such as the following:

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-29 Thread mai shurong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386258#comment-14386258 ] mai shurong commented on YARN-3416: --- The job ran two days ago; the AM logs have since been deleted

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-29 Thread mai shurong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386244#comment-14386244 ] mai shurong commented on YARN-3416: --- In YARN-1680, there are only 4 NodeManagers in cluste

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-29 Thread Rohith (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386187#comment-14386187 ] Rohith commented on YARN-3416: -- I suspect this scenario would be same as YARN-1680. Would you

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-29 Thread mai shurong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386186#comment-14386186 ] mai shurong commented on YARN-3416: --- There are methods to work around: temporarily increa

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-29 Thread Rohith (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386182#comment-14386182 ] Rohith commented on YARN-3416: -- bq. And then, a map fails and retries, waiting for a core, while

[jira] [Commented] (YARN-3416) deadlock in a job between map and reduce cores allocation

2015-03-29 Thread Rohith (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386179#comment-14386179 ] Rohith commented on YARN-3416: -- Would you mind attaching AM logs? > deadlock in a job between