[jira] [Updated] (YARN-2456) Possible livelock in CapacityScheduler when RM is recovering apps

2014-09-12 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2456:
--
Attachment: YARN-2456.2.patch

Rebased the patch.

 Possible livelock in CapacityScheduler when RM is recovering apps
 -

 Key: YARN-2456
 URL: https://issues.apache.org/jira/browse/YARN-2456
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-2456.1.patch, YARN-2456.2.patch


 Consider this scenario:
 1. RM is configured with a single queue, and only one application can be 
 active at a time.
 2. Submit App1, which uses up the queue's whole capacity.
 3. Submit App2, which remains pending.
 4. Restart RM.
 5. App2 is recovered before App1, so App2 is added to the activeApplications 
 list. App1 now remains pending (because of the max-active-app limit).
 6. All containers of App1 are recovered when the NM registers, and they use 
 up the whole queue capacity again.
 7. Since the queue is full, App2 cannot proceed to allocate its AM container.
 8. Meanwhile, App1 cannot become active because of the max-active-app 
 limit, so neither application can make progress.
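
 The scenario can be reproduced in a toy model. The sketch below is not the 
 CapacityScheduler code: the class, the App type, the two constants, and the 
 canMakeProgress method are all hypothetical names chosen for illustration. 
 It assumes that apps are activated in recovery order up to the 
 max-active-app limit, and that recovered containers are charged to the queue 
 whether or not their app is active.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the recovery scenario; not the actual CapacityScheduler code. */
public class RecoveryLivelockSketch {
    // Hypothetical stand-ins for the limits in the scenario.
    static final int MAX_ACTIVE_APPS = 1;
    static final int QUEUE_CAPACITY = 10;

    static final class App {
        final String name;
        final int recoveredContainers; // containers reported by the NM after restart

        App(String name, int recoveredContainers) {
            this.name = name;
            this.recoveredContainers = recoveredContainers;
        }
    }

    /** Returns true if at least one active app can make progress after recovery. */
    static boolean canMakeProgress(List<App> recoveryOrder) {
        List<App> active = new ArrayList<>();
        int used = 0;

        // Step 5: apps become active in recovery order until the limit is hit;
        // the rest stay pending.
        for (App app : recoveryOrder) {
            if (active.size() < MAX_ACTIVE_APPS) {
                active.add(app);
            }
        }

        // Step 6: recovered containers count against the queue even if
        // their app is still pending.
        for (App app : recoveryOrder) {
            used += app.recoveredContainers;
        }

        // Steps 7-8: an active app progresses only if it already holds
        // containers or the queue has room left for its AM container.
        for (App app : active) {
            if (app.recoveredContainers > 0 || used < QUEUE_CAPACITY) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        App app1 = new App("App1", QUEUE_CAPACITY); // held the whole queue before restart
        App app2 = new App("App2", 0);              // was pending before restart

        System.out.println("App1 recovered first: " + canMakeProgress(Arrays.asList(app1, app2)));
        System.out.println("App2 recovered first: " + canMakeProgress(Arrays.asList(app2, app1)));
    }
}
```

 In this model only the recovery order differs between the two calls. When 
 App2 is recovered first, it holds the single active slot while App1's 
 containers hold the whole queue, which is the stuck state described in 
 steps 7 and 8.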



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2456) Possible livelock in CapacityScheduler when RM is recovering apps

2014-09-10 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2456:
--
Summary: Possible livelock in CapacityScheduler when RM is recovering apps  
(was: Possible lovelock in CapacityScheduler when RM is recovering apps)

 Possible livelock in CapacityScheduler when RM is recovering apps
 -

 Key: YARN-2456
 URL: https://issues.apache.org/jira/browse/YARN-2456
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-2456.1.patch




