[jira] [Commented] (YARN-2266) Add an application timeout service in RM to kill applications which are not getting resources
[ https://issues.apache.org/jira/browse/YARN-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142399#comment-15142399 ]

Sudip Hazra Choudhury commented on YARN-2266:
---------------------------------------------

Surely, we are interested in this feature; it would be very helpful. It could have a default value of 0 (infinite), and users could set a non-zero value depending on their requirements.

> Add an application timeout service in RM to kill applications which are not getting resources
> ---------------------------------------------------------------------------------------------
>
>                 Key: YARN-2266
>                 URL: https://issues.apache.org/jira/browse/YARN-2266
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Ashutosh Jindal
>
> Currently, if an application is submitted to the RM, the app keeps waiting
> until resources are allocated for the AM. Such an application may be stuck
> until a resource is allocated for the AM, which may be due to over-utilization
> of queue or user limits, etc. In a production cluster, some periodically
> running applications may have a smaller cluster share. So if resources are
> still not available after waiting for some time, such applications can be
> marked as failed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-2266) Add an application timeout service in RM to kill applications which are not getting resources
[ https://issues.apache.org/jira/browse/YARN-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143985#comment-15143985 ]

Rohith Sharma K S commented on YARN-2266:
-----------------------------------------

Apologies for not noticing this JIRA before creating YARN-3813. Both JIRAs address the same use case. There is some progress in YARN-3813, along with a POC patch, so we can continue the discussion in YARN-3813.
[jira] [Commented] (YARN-2266) Add an application timeout service in RM to kill applications which are not getting resources
[ https://issues.apache.org/jira/browse/YARN-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143169#comment-15143169 ]

Devaraj K commented on YARN-2266:
---------------------------------

Duplicate of YARN-3813
[jira] [Commented] (YARN-2266) Add an application timeout service in RM to kill applications which are not getting resources
[ https://issues.apache.org/jira/browse/YARN-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524153#comment-14524153 ]

Zhijie Shen commented on YARN-2266:
-----------------------------------

Are we still interested in this enhancement? If not, we can close this JIRA as Won't Fix.
[jira] [Commented] (YARN-2266) Add an application timeout service in RM to kill applications which are not getting resources
[ https://issues.apache.org/jira/browse/YARN-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056459#comment-14056459 ]

Vinod Kumar Vavilapalli commented on YARN-2266:
-----------------------------------------------

bq. So after waiting for some time, if resources are not available, such applications can be made as failed.

What happens next? The apps are going to be resubmitted and they will still wait in the queue.

I am trying to understand the overall picture. It seems like you want to reserve some capacity for a queue of periodically running applications, to prevent that from happening in the first place.
[jira] [Commented] (YARN-2266) Add an application timeout service in RM to kill applications which are not getting resources
[ https://issues.apache.org/jira/browse/YARN-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14057127#comment-14057127 ]

Ashutosh Jindal commented on YARN-2266:
---------------------------------------

bq. What happens next? The apps are going to be resubmitted and they will still wait in the queue.

No, the same application will not be submitted again. Consider a case where an application runs periodically every hour and its average completion time is 30 minutes. In that case, if the application gets no resources for 30 minutes, or gets them only after 30 minutes, it is better to kill the application and let the next scheduled run serve the purpose.
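
For illustration only, the expiry check described in this thread might be sketched as below. This is a minimal sketch, not the YARN-2266 or YARN-3813 patch: the class `PendingAppTimeoutSketch`, the `PendingApp` type, and both methods are hypothetical names, and none of them are ResourceManager APIs. It assumes the timeout is measured from submission time and that a value of 0 means "wait forever", as suggested in the comments.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names are real YARN classes. */
public class PendingAppTimeoutSketch {

    /** An application that is still waiting for its AM container. */
    static final class PendingApp {
        final String appId;
        final long submitTimeMs;
        final long timeoutMs; // assumption: 0 means no timeout (wait forever)

        PendingApp(String appId, long submitTimeMs, long timeoutMs) {
            this.appId = appId;
            this.submitTimeMs = submitTimeMs;
            this.timeoutMs = timeoutMs;
        }
    }

    /** True when the app has a non-zero timeout and has waited at least that long. */
    static boolean isExpired(PendingApp app, long nowMs) {
        if (app.timeoutMs <= 0) {
            return false; // timeout disabled for this app
        }
        return nowMs - app.submitTimeMs >= app.timeoutMs;
    }

    /** Returns the ids of pending apps that one monitor pass would fail or kill. */
    static List<String> findExpired(List<PendingApp> pending, long nowMs) {
        List<String> expired = new ArrayList<>();
        for (PendingApp app : pending) {
            if (isExpired(app, nowMs)) {
                expired.add(app.appId);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        List<PendingApp> pending = new ArrayList<>();
        // The hourly job from the comment above: give up after 30 minutes of waiting.
        pending.add(new PendingApp("hourly-report", 0L, 30 * minute));
        // An app with the default of 0 is never expired.
        pending.add(new PendingApp("no-timeout", 0L, 0L));

        System.out.println(findExpired(pending, 29 * minute)); // prints []
        System.out.println(findExpired(pending, 30 * minute)); // prints [hourly-report]
    }
}
```

A real monitor would presumably run such a pass periodically inside the RM and move expired applications to a failed or killed state; how that is wired in is left to the discussion in YARN-3813.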