[ 
https://issues.apache.org/jira/browse/YARN-10357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210612#comment-17210612
 ] 

Tanu Ajmera commented on YARN-10357:
------------------------------------

There are three possible ways to do this:

1. During decommission, the NM notifies the RM that it is about to be 
decommissioned, so that the RM has this information and can release the 
containers before the 10-minute timeout expires.
2. The third party that decommissions the NM informs the RM about the 
decommissioning, and the RM then releases the containers.
3. Reduce the timeout to 3 minutes to save time.
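
As a rough illustration of option 3, assuming the 10-minute timeout discussed 
here is the RM-side NodeManager liveness expiry interval (whose default is 
600000 ms), the change could look like the yarn-site.xml fragment below. The 
3-minute value is only the figure proposed above, not a tested recommendation.

```xml
<!-- yarn-site.xml on the ResourceManager (illustrative sketch only) -->
<property>
  <!-- How long the RM waits without a heartbeat before marking an NM as lost -->
  <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
  <!-- 180000 ms = 3 minutes; the default is 600000 ms = 10 minutes -->
  <value>180000</value>
</property>
```

A shorter interval applies to every NM in the cluster, so a node that is only 
briefly unreachable would also be marked lost sooner. Options 1 and 2 avoid 
that trade-off.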

cc [~wangda] [~sunil.gov...@gmail.com]

> Proactively relocate allocated containers from a stopped node
> -------------------------------------------------------------
>
>                 Key: YARN-10357
>                 URL: https://issues.apache.org/jira/browse/YARN-10357
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler, multi-node-placement
>    Affects Versions: 3.4.0
>            Reporter: Prabhu Joseph
>            Assignee: Tanu Ajmera
>            Priority: Major
>
> In a cloud environment, nodes can be commissioned frequently. Always 
> waiting for the 10-minute timeout is not ideal; it is better to improve the 
> logic by preempting containers that were newly allocated (but not acquired) 
> on an NM that has stopped heartbeating. With this, we can proactively 
> relocate containers to different nodes before the 10-minute timeout.
> cc [~leftnoteasy]



