[ 
https://issues.apache.org/jira/browse/YARN-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2487:
------------------------------------
    Description: 
 There are scenarios where an AM never gets containers and waits indefinitely. 
We hit one such scenario, which causes the applications to hang: 
Consider a cluster with 2 NMs, each with 8 GB of memory.
2 applications (MR2) are launched in the default queue, and each AM takes 
2 GB.
Each AM is placed on a different NM. Each AM then requests a container with 
7 GB of memory.
Since only 6 GB is available on each NM, both applications hang 
forever.

To avoid such scenarios, I would like to propose a 
generic timeout feature for all AMs in YARN, such that if no containers are 
assigned to an application for a defined period, YARN can time out the 
application attempt.
The default can be 0, in which case the RM does not time out the app attempt; 
the user can set their own timeout when submitting the application.

  was:
 There are some scenarios where AM will not get containers and indefinetely 
waiting. We faced one such sceanrio which makes the applications to get hung : 
Consider a cluster setup which has 2 NMS of each 8GB resource,
And 2 applications are launched in the default queue where in each AM is taking 
2 GB each.
Each AM is placed in each of the NM. Now each AM is requesting for container of 
7Gb  mem resource .
As in each NM only 6GB resource is available both the applications are hung 
forever.

To avoid such scenarios i would to propose 
generic timeout feature for all AM's @ the yarn side such that if no containers 
are assigned for an application for a defined period than yarn can timeout the 
application attempt.
Default can be set to 0 where in RM will not timeout the app attempt and user 
can set his own timeout when he submits the application


> Need to support timeout of AM When no containers are assigned to it for a 
> defined period
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-2487
>                 URL: https://issues.apache.org/jira/browse/YARN-2487
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>
>  There are scenarios where an AM never gets containers and waits 
> indefinitely. We hit one such scenario, which causes the applications to 
> hang: 
> Consider a cluster with 2 NMs, each with 8 GB of memory.
> 2 applications (MR2) are launched in the default queue, and each AM takes 
> 2 GB.
> Each AM is placed on a different NM. Each AM then requests a container 
> with 7 GB of memory.
> Since only 6 GB is available on each NM, both applications hang 
> forever.
> To avoid such scenarios, I would like to propose a 
> generic timeout feature for all AMs in YARN, such that if no containers are 
> assigned to an application for a defined period, YARN can time out the 
> application attempt.
> The default can be 0, in which case the RM does not time out the app 
> attempt; the user can set their own timeout when submitting the application.
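
To make the scenario and the proposed check concrete, here is a minimal 
sketch in Java. It is only an illustration under stated assumptions: the 
class AllocationTimeoutSketch and the methods fits and shouldTimeOut are 
hypothetical and are not part of YARN, and the 10 minute timeout is an 
arbitrary example value. It shows why a 7 GB request cannot be placed on 
an 8 GB NM that already hosts a 2 GB AM, and how a timeout of 0 could 
mean "never time out".

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the proposal; none of these names exist in YARN. */
public class AllocationTimeoutSketch {

    /** True if a request fits into the memory left on a node. */
    public static boolean fits(int capacityGb, int usedGb, int requestGb) {
        return requestGb <= capacityGb - usedGb;
    }

    /**
     * Decides whether an app attempt should be timed out.
     * A timeout of 0 (the proposed default) disables the check.
     * lastAllocationMs is assumed to be the time of the last container
     * assignment, or the attempt start time if nothing was assigned yet.
     */
    public static boolean shouldTimeOut(long timeoutMs, long lastAllocationMs, long nowMs) {
        if (timeoutMs <= 0) {
            return false; // timeout disabled
        }
        return nowMs - lastAllocationMs >= timeoutMs;
    }

    public static void main(String[] args) {
        // Scenario from the description: 8 GB NM, 2 GB AM, 7 GB request.
        int nmCapacityGb = 8;
        int amGb = 2;
        int requestGb = 7;
        System.out.println("request fits: " + fits(nmCapacityGb, amGb, requestGb));

        // Example timeout value chosen only for illustration.
        long timeoutMs = TimeUnit.MINUTES.toMillis(10);
        long attemptStartMs = 0L;
        long nowMs = TimeUnit.MINUTES.toMillis(11);
        System.out.println("time out attempt: " + shouldTimeOut(timeoutMs, attemptStartMs, nowMs));
        System.out.println("time out with default 0: " + shouldTimeOut(0L, attemptStartMs, nowMs));
    }
}
```

In a real implementation the check would presumably run periodically in 
the RM per application attempt, with the timeout value supplied by the 
user at submission time; where exactly that hook would live is left open 
here.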



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
