[jira] [Updated] (YARN-2456) Possible livelock in CapacityScheduler when RM is recovering apps

2014-09-12 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2456:
--
Attachment: YARN-2456.2.patch

Rebased the patch.

> Possible livelock in CapacityScheduler when RM is recovering apps
> -
>
> Key: YARN-2456
> URL: https://issues.apache.org/jira/browse/YARN-2456
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Jian He
> Assignee: Jian He
> Attachments: YARN-2456.1.patch, YARN-2456.2.patch
>
>
> Consider this scenario:
> 1. RM is configured with a single queue in which only one application can be 
> active at a time.
> 2. Submit App1, which uses up the queue's whole capacity.
> 3. Submit App2, which remains pending.
> 4. Restart RM.
> 5. App2 is recovered before App1, so App2 is added to the activeApplications 
> list. App1 now remains pending (because of the max-active-app limit).
> 6. All containers of App1 are recovered when the NM registers, and they use up 
> the whole queue capacity again.
> 7. Since the queue is full, App2 cannot proceed to allocate its AM container.
> 8. Meanwhile, App1 cannot become active because of the max-active-app limit.
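
The scenario above can be replayed with a small toy model. This is only a sketch for illustration: ToyQueue, recoverApp, recoverContainers, tryAllocate, and isStuck are hypothetical names, the capacity of 10 units is an arbitrary assumption, and none of this is the actual CapacityScheduler code or the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

class LivelockSketch {
    // Hypothetical toy queue; NOT the real CapacityScheduler/LeafQueue classes.
    static final class ToyQueue {
        final int capacity;       // total resource units in the queue (assumed)
        final int maxActiveApps;  // stands in for the max-active-app limit
        int used = 0;
        final List<String> active = new ArrayList<>();
        final List<String> pending = new ArrayList<>();

        ToyQueue(int capacity, int maxActiveApps) {
            this.capacity = capacity;
            this.maxActiveApps = maxActiveApps;
        }

        // Recovery order decides who becomes active: first come, first activated.
        void recoverApp(String app) {
            if (active.size() < maxActiveApps) {
                active.add(app);
            } else {
                pending.add(app);
            }
        }

        // Containers reported by a registering NM count against the queue,
        // whether or not the owning app is active.
        void recoverContainers(int resources) {
            used += resources;
        }

        // Only an active app may allocate, and only if capacity is left.
        boolean tryAllocate(String app, int resources) {
            if (!active.contains(app) || used + resources > capacity) {
                return false;
            }
            used += resources;
            return true;
        }

        // Stuck: the queue is full, the active slots are taken, and an app is waiting.
        boolean isStuck() {
            return used >= capacity
                && active.size() >= maxActiveApps
                && !pending.isEmpty();
        }
    }

    // Replays steps 4 to 7 of the scenario after the RM restart.
    static ToyQueue replayScenario() {
        ToyQueue q = new ToyQueue(10, 1);
        q.recoverApp("App2");      // step 5: App2 is recovered first and becomes active
        q.recoverApp("App1");      // step 5: App1 stays pending due to the limit
        q.recoverContainers(10);   // step 6: App1's containers fill the queue
        q.tryAllocate("App2", 1);  // step 7: App2's AM container request fails
        return q;
    }

    public static void main(String[] args) {
        ToyQueue q = replayScenario();
        System.out.println("active=" + q.active + " pending=" + q.pending
            + " stuck=" + q.isStuck());
    }
}
```

In this model the outcome depends only on the recovery order: if App1 were recovered before App2, App1 would take the active slot and its recovered containers would belong to an active application.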



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2456) Possible livelock in CapacityScheduler when RM is recovering apps

2014-09-10 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2456:
--
Summary: Possible livelock in CapacityScheduler when RM is recovering apps  
(was: Possible lovelock in CapacityScheduler when RM is recovering apps)

> Possible livelock in CapacityScheduler when RM is recovering apps
> -
>
> Key: YARN-2456
> URL: https://issues.apache.org/jira/browse/YARN-2456
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Jian He
> Assignee: Jian He
> Attachments: YARN-2456.1.patch


