[jira] [Updated] (YARN-3416) deadlock in a job between map and reduce cores allocation
     [ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mai shurong updated YARN-3416:
------------------------------
    Attachment: queue_with_max333cores.png
                queue_with_max263cores.png
                queue_with_max163cores.png

queue_with_max163cores.png: a job submitted to a queue with a 163-core maximum
queue_with_max263cores.png: a job submitted to a queue with a 263-core maximum
queue_with_max333cores.png: a job submitted to a queue with a 333-core maximum

> deadlock in a job between map and reduce cores allocation
> ---------------------------------------------------------
>
>                 Key: YARN-3416
>                 URL: https://issues.apache.org/jira/browse/YARN-3416
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: mai shurong
>            Priority: Critical
>         Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, queue_with_max163cores.png, queue_with_max263cores.png, queue_with_max333cores.png
>
> I submit a big job, with 500 maps and 350 reduces, to a queue (fairscheduler) with a 300-core maximum. When the job reaches 100% map progress, the 300 launched reduces have occupied all 300 cores in the queue. Then a map fails and is retried, waiting for a core, while the 300 reduces wait for the failed map to finish. So a deadlock occurs. As a result, the job is blocked, and later jobs in the queue cannot run because no cores are available in the queue. I think there is a similar issue for the memory of a queue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
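[Editor's note] The circular wait described above can be modeled as a simple core-accounting exercise. This is a sketch, not FairScheduler code: the 300-core queue cap and the task counts are taken from the report, and `retried_map_can_start` is a hypothetical helper introduced purely for illustration.

```python
# Minimal model of the core-accounting deadlock from the report.
# Assumption: every task (map or reduce) needs one core from the queue.

QUEUE_MAX_CORES = 300  # queue cap from the report

def retried_map_can_start(running_reduces, cores_per_task=1):
    """A retried map attempt can start only if the queue has a spare core."""
    free_cores = QUEUE_MAX_CORES - running_reduces * cores_per_task
    return free_cores >= cores_per_task

# All maps have finished, so 300 of the 350 reduces hold every core...
print(retried_map_can_start(300))  # False: no core left for the map retry
# ...and those reduces will not release cores until the retried map's
# output exists -- a circular wait, i.e. the reported deadlock.

# Freeing even one reduce's core would unblock the retried map:
print(retried_map_can_start(299))  # True
```

The second call shows why preempting (or delaying) a single reduce is enough to break the cycle, which is why the mitigation knobs discussed later in this thread target reduce scheduling.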
[jira] [Updated] (YARN-3416) deadlock in a job between map and reduce cores allocation
     [ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mai shurong updated YARN-3416:
------------------------------
    Attachment: AM_log_head10.txt.gz
                AM_log_tail10.txt.gz

First 10 lines and last 10 lines of the AM log of a deadlocked job.

> deadlock in a job between map and reduce cores allocation
> ---------------------------------------------------------
>
>                 Key: YARN-3416
>                 URL: https://issues.apache.org/jira/browse/YARN-3416
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: mai shurong
>            Priority: Critical
>         Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz
[jira] [Updated] (YARN-3416) deadlock in a job between map and reduce cores allocation
     [ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-3416:
-----------------------------------
    Priority: Critical  (was: Major)

> deadlock in a job between map and reduce cores allocation
> ---------------------------------------------------------
>
>                 Key: YARN-3416
>                 URL: https://issues.apache.org/jira/browse/YARN-3416
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: mai shurong
>            Priority: Critical
[jira] [Updated] (YARN-3416) deadlock in a job between map and reduce cores allocation
     [ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mai shurong updated YARN-3416:
------------------------------
    Description:
        I submit a big job, with 500 maps and 350 reduces, to a queue (fairscheduler) with a 300-core maximum. When the job reaches 100% map progress, the 300 launched reduces have occupied all 300 cores in the queue. Then a map fails and is retried, waiting for a core, while the 300 reduces wait for the failed map to finish. So a deadlock occurs. As a result, the job is blocked, and later jobs in the queue cannot run because no cores are available in the queue. I think there is a similar issue for the memory of a queue.

    was:
        I submit a big job, with 500 maps and 350 reduces, to a queue (fairscheduler) with a 300-core maximum. When the job reaches 100% map progress, the 300 launched reduces have occupied all 300 cores in the queue. Then a map fails and is retried, waiting for a core, while the 300 reduces wait for the failed map to finish. So a deadlock occurs. As a result, the job is blocked, and later jobs in the queue cannot run because no cores are available in the queue.

> deadlock in a job between map and reduce cores allocation
> ---------------------------------------------------------
>
>                 Key: YARN-3416
>                 URL: https://issues.apache.org/jira/browse/YARN-3416
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: mai shurong
[jira] [Updated] (YARN-3416) deadlock in a job between map and reduce cores allocation
     [ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mai shurong updated YARN-3416:
------------------------------
    Description:
        I submit a big job, which has 500 maps and 350 reduce, to a queue (fairscheduler) with 300 max cores. When the big mapreduce job is running 100% maps, the 300 reduces have occupied 300 max cores in the queue. And then, a map fails and retry, waiting for a core, while the 300 reduces are waiting for failed map to finish. So a deadlock accur, the job is blocked, and the later job in the queue cannot run because no available cores in the queue.

> deadlock in a job between map and reduce cores allocation
> ---------------------------------------------------------
>
>                 Key: YARN-3416
>                 URL: https://issues.apache.org/jira/browse/YARN-3416
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: mai shurong
[jira] [Updated] (YARN-3416) deadlock in a job between map and reduce cores allocation
     [ https://issues.apache.org/jira/browse/YARN-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mai shurong updated YARN-3416:
------------------------------
    Description:
        I submit a big job, with 500 maps and 350 reduces, to a queue (fairscheduler) with a 300-core maximum. When the job reaches 100% map progress, the 300 launched reduces have occupied all 300 cores in the queue. Then a map fails and is retried, waiting for a core, while the 300 reduces wait for the failed map to finish. So a deadlock occurs. As a result, the job is blocked, and later jobs in the queue cannot run because no cores are available in the queue.

    was:
        I submit a big job, which has 500 maps and 350 reduce, to a queue (fairscheduler) with 300 max cores. When the big mapreduce job is running 100% maps, the 300 reduces have occupied 300 max cores in the queue. And then, a map fails and retry, waiting for a core, while the 300 reduces are waiting for failed map to finish. So a deadlock accur, the job is blocked, and the later job in the queue cannot run because no available cores in the queue.

> deadlock in a job between map and reduce cores allocation
> ---------------------------------------------------------
>
>                 Key: YARN-3416
>                 URL: https://issues.apache.org/jira/browse/YARN-3416
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: mai shurong
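[Editor's note] For operators hitting this lockup, two MapReduce AM settings are commonly tuned to shrink the window in which reduces can pin an entire queue. The property names below exist in Hadoop 2.x, but the values shown are illustrative assumptions, not a verified fix for this particular bug:

```xml
<!-- mapred-site.xml: sketch of mitigation settings (illustrative values) -->
<property>
  <!-- Launch reduces only after 95% of maps have completed, so fewer
       cores are held by reduces while maps may still fail and retry. -->
  <name>mapreduce.job.reduce.slowstart.completedmaps</name>
  <value>0.95</value>
</property>
<property>
  <!-- Fraction of reduces the MR AM is allowed to preempt to free
       resources for map attempts that cannot otherwise be scheduled. -->
  <name>yarn.app.mapreduce.am.job.reduce.preemption.limit</name>
  <value>1.0</value>
</property>
```

Note that the AM's reduce-preemption path depends on the headroom the scheduler reports in heartbeats; if the FairScheduler overstates headroom for the capped queue, preemption may never trigger, in which case tuning these values alone would not resolve the hang.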