[ 
https://issues.apache.org/jira/browse/YARN-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

MENG DING updated YARN-4080:
----------------------------
    Description: 
YARN-1197 adds support for resizing container resources. One major use case 
for this feature is long-running services managed by Slider, which need to 
dynamically flex the resource allocation of individual components (e.g., an 
HBase region server) up and down, based on application metrics/alerts obtained 
through third-party monitoring and policy engines. 

One key issue with increasing a container's resources at an arbitrary point in 
time is that the additional resources needed by the application component may 
not be available *on the specific node*. In this case, we need to rely on 
preemption logic to reclaim the required resources from other (preemptable) 
applications running on the same node. But this may not be possible today 
because:
* preemption doesn't consider the constraints of pending resource requests, 
such as hard locality requirements, user limits, etc. (being addressed in 
YARN-2154 and possibly in YARN-3769?) 
* there may not be any preemptable container available, because no queue is 
over its guaranteed capacity.
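
To make the second limitation concrete, here is a minimal sketch of the check 
involved. It is not YARN code: the Container class, both helper functions, and 
the memory-only resource model are all hypothetical, and it assumes a 
container is preemptable only when its queue is over its guaranteed capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    # Hypothetical, simplified stand-in for a running container.
    container_id: str
    queue: str
    node: str
    memory_mb: int


def preemptable_on_node(containers, node, queue_used_mb, queue_guaranteed_mb):
    """Return containers on `node` whose queue is over its guaranteed capacity.

    Assumption: only containers from over-capacity queues may be preempted.
    """
    return [
        c for c in containers
        if c.node == node
        and queue_used_mb[c.queue] > queue_guaranteed_mb[c.queue]
    ]


def can_increase(node_free_mb, extra_mb, candidates):
    """True if free memory plus reclaimable memory on the node covers the increase."""
    reclaimable_mb = sum(c.memory_mb for c in candidates)
    return node_free_mb + reclaimable_mb >= extra_mb


containers = [
    Container("c1", "batch", "n1", 4096),
    Container("c2", "batch", "n2", 4096),
]
guaranteed_mb = {"batch": 8192}

# The "batch" queue is exactly at its guarantee, so nothing on n1 is preemptable
# and a 4096 MB increase cannot be satisfied from 1024 MB of free memory.
at_guarantee = preemptable_on_node(containers, "n1", {"batch": 8192}, guaranteed_mb)
blocked = can_increase(1024, 4096, at_guarantee)

# Once "batch" is over its guarantee, c1 on n1 becomes a preemption candidate.
over_guarantee = preemptable_on_node(containers, "n1", {"batch": 12288}, guaranteed_mb)
satisfied = can_increase(1024, 4096, over_guarantee)
```

In the first case the increase stays blocked even though the node is fully 
used by other applications, which is the situation described above.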

What we need, ideally, is a way for YARN to support future capacity planning of 
long-running services. At a minimum, we need a way to let YARN know about the 
resource usage prediction/pattern of a long-running service. Given this 
knowledge, YARN should be able to preempt resources from other applications to 
accommodate the resource needs of the long-running service.

  was:
YARN-1197 addresses the functionality of container resource resize. One major 
use case of this feature is for long running services managed by Slider to 
dynamically flex up and down resource allocation of individual components 
(e.g., HBase region server), based on application metrics/alerts obtained 
through third-party monitoring and policy engine. 

One key issue with increasing container resource at any point of time is that 
the additional resource needed by the application component may not be 
available *on the specific node*. In this case, we need to rely on preemption 
logic to reclaim the required resource back from other (preemptable) 
applications running on the same node. But this may not be possible today 
because:
* preemption doesn't consider constraints of pending resource requests, such as 
hard locality requirements, user limits, etc (being addressed in YARN-2154 and 
possibly in YARN-3769?) 
* there may not be any preemptable container available due to the fact that no 
application is over its guaranteed capacity.

What we need, ideally, is a way for YARN to support future capacity planning of 
long running services. At the minimum, we need to provide a way to let YARN 
know about the resource usage prediction/pattern of a long running service. And 
given this knowledge, YARN should be able to preempt resources from other 
applications to accommodate the resource needs of the long running service.


> Capacity planning for long running services on YARN
> ---------------------------------------------------
>
>                 Key: YARN-4080
>                 URL: https://issues.apache.org/jira/browse/YARN-4080
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: api, resourcemanager
>            Reporter: MENG DING
>
> YARN-1197 adds support for resizing container resources. One major use case 
> for this feature is long-running services managed by Slider, which need to 
> dynamically flex the resource allocation of individual components (e.g., an 
> HBase region server) up and down, based on application metrics/alerts 
> obtained through third-party monitoring and policy engines. 
> One key issue with increasing a container's resources at an arbitrary point 
> in time is that the additional resources needed by the application component 
> may not be available *on the specific node*. In this case, we need to rely on 
> preemption logic to reclaim the required resources from other (preemptable) 
> applications running on the same node. But this may not be possible today 
> because:
> * preemption doesn't consider the constraints of pending resource requests, 
> such as hard locality requirements, user limits, etc. (being addressed in 
> YARN-2154 and possibly in YARN-3769?) 
> * there may not be any preemptable container available, because no queue is 
> over its guaranteed capacity.
> What we need, ideally, is a way for YARN to support future capacity planning 
> of long-running services. At a minimum, we need a way to let YARN know about 
> the resource usage prediction/pattern of a long-running service. Given this 
> knowledge, YARN should be able to preempt resources from other applications 
> to accommodate the resource needs of the long-running service.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
