[jira] [Updated] (YARN-2526) Scheduler Load Simulator may enter deadlock if lots of applications submitted to the RM at the same time

2014-09-09 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-2526:
--
Attachment: (was: YARN-2526-1.patch)

> Scheduler Load Simulator may enter deadlock if lots of applications submitted 
> to the RM at the same time
> 
>
> Key: YARN-2526
> URL: https://issues.apache.org/jira/browse/YARN-2526
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Wei Yan
> Assignee: Wei Yan
> Priority: Minor
> Attachments: YARN-2526-1.patch
>
>
> The simulation may deadlock when the application simulators occupy every 
> thread in the thread pool and each one waits for its AM container to be 
> allocated. In that state, every AM simulator waits for an NM simulator 
> heartbeat to allocate resources, while every NM simulator waits for an AM 
> simulator to release a thread, so neither side can make progress.
> To resolve the deadlock, the while() loop in MRAMSimulator needs to be removed:
> {code}
> // waiting until the AM container is allocated
> while (true) {
>   if (response != null && !response.getAllocatedContainers().isEmpty()) {
>     // get AM container
>     .
>     break;
>   }
>   // this sleep time is different from HeartBeat
>   Thread.sleep(1000);
>   // send out empty request
>   sendContainerRequest();
>   response = responseQueue.take();
> }
> {code}
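
For illustration only, here is a minimal sketch of one way to avoid holding a
pool thread while waiting for the AM container. It is not the attached patch
and does not use the real simulator classes: NonBlockingAmSketch, AmTask and
runSimulation are hypothetical names, a String stands in for an allocated
container, and a single delayed task stands in for the NM heartbeats. Instead
of looping with sleep() and take(), each AM task checks its response queue
with a non-blocking poll() and, if nothing has arrived, reschedules itself and
returns the thread to the pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class NonBlockingAmSketch {

  /** Hypothetical stand-in for an AM simulator; not the real MRAMSimulator. */
  static final class AmTask implements Runnable {
    private final ScheduledExecutorService pool;
    private final BlockingQueue<String> responseQueue;
    private final CountDownLatch allocated;

    AmTask(ScheduledExecutorService pool, BlockingQueue<String> responseQueue,
        CountDownLatch allocated) {
      this.pool = pool;
      this.responseQueue = responseQueue;
      this.allocated = allocated;
    }

    @Override
    public void run() {
      // poll() returns null immediately when no response is queued,
      // so this task never parks a pool thread.
      String container = responseQueue.poll();
      if (container != null) {
        allocated.countDown(); // AM container received
        return;
      }
      // Nothing yet: hand the thread back and check again later.
      pool.schedule(this, 10, TimeUnit.MILLISECONDS);
    }
  }

  /** Returns true if every AM task received a container within 5 seconds. */
  static boolean runSimulation(int amCount, int poolSize) {
    ScheduledExecutorService pool = Executors.newScheduledThreadPool(poolSize);
    CountDownLatch allocated = new CountDownLatch(amCount);
    List<BlockingQueue<String>> queues = new ArrayList<>();
    try {
      for (int i = 0; i < amCount; i++) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queues.add(queue);
        pool.execute(new AmTask(pool, queue, allocated));
      }
      // Stand-in for the NM heartbeats: one delayed task on the same pool
      // that hands a container to each AM. It can only run if a thread is free.
      pool.schedule(() -> {
        for (BlockingQueue<String> queue : queues) {
          queue.add("container");
        }
      }, 50, TimeUnit.MILLISECONDS);
      return allocated.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // More AM tasks than pool threads, the situation described in the issue.
    System.out.println(runSimulation(8, 2));
  }
}
```

In this sketch the heartbeat task shares the pool with the AM tasks, as the NM
simulators do in the report. Because no AM task blocks, the heartbeat still
gets a thread even when there are more AM tasks than pool threads.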



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2526) Scheduler Load Simulator may enter deadlock if lots of applications submitted to the RM at the same time

2014-09-09 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-2526:
--
Attachment: YARN-2526-1.patch

> Scheduler Load Simulator may enter deadlock if lots of applications submitted 
> to the RM at the same time
> 
>
> Key: YARN-2526
> URL: https://issues.apache.org/jira/browse/YARN-2526
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Wei Yan
> Assignee: Wei Yan
> Priority: Minor
> Attachments: YARN-2526-1.patch, YARN-2526-1.patch


[jira] [Updated] (YARN-2526) Scheduler Load Simulator may enter deadlock if lots of applications submitted to the RM at the same time

2014-09-09 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-2526:
--
Attachment: YARN-2526-1.patch

> Scheduler Load Simulator may enter deadlock if lots of applications submitted 
> to the RM at the same time
> 
>
> Key: YARN-2526
> URL: https://issues.apache.org/jira/browse/YARN-2526
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Wei Yan
> Assignee: Wei Yan
> Priority: Minor
> Attachments: YARN-2526-1.patch


[jira] [Updated] (YARN-2526) Scheduler Load Simulator may enter deadlock if lots of applications submitted to the RM at the same time

2014-09-09 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-2526:
--
Priority: Minor  (was: Major)

> Scheduler Load Simulator may enter deadlock if lots of applications submitted 
> to the RM at the same time
> 
>
> Key: YARN-2526
> URL: https://issues.apache.org/jira/browse/YARN-2526
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Wei Yan
> Assignee: Wei Yan
> Priority: Minor
> Attachments: YARN-2526-1.patch