[
https://issues.apache.org/jira/browse/SLIDER-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gour Saha updated SLIDER-930:
-----------------------------
Description:
YARN-611 provides this feature. Currently Slider apps are bound by the number
set for yarn.resourcemanager.am.max-retries in the cluster. By default this
value is set to 2, which is very low for long-running services.
Slider AM should use the feature provided in YARN-611 and set an interval after
which the failure count will be reset to 0.
I believe the property to set on ApplicationSubmissionContext is
attemptFailuresValidityInterval (via setAttemptFailuresValidityInterval, in
milliseconds). To start with, Slider can set it to 5 minutes, which should be
a reasonable default.
was:
YARN-611 provides this feature. Currently Slider apps are bound by the number
set for yarn.resourcemanager.am.max-retries in the cluster. By default this
value is set to 2, which is very low for long running services.
Slider AM should use the feature provided in YARN-611 and set a interval after
which the failure count will be reset to 0.
I believe the API to call on ApplicationSubmissionContext is
attemptFailuresValidityInterval. To start with Slider can set it to 5 mins
which should be a reasonable default.
> Incorporate Yarn feature of resetting AM failure count into Slider AM
> ---------------------------------------------------------------------
>
> Key: SLIDER-930
> URL: https://issues.apache.org/jira/browse/SLIDER-930
> Project: Slider
> Issue Type: Bug
> Components: appmaster
> Affects Versions: Slider 0.80
> Reporter: Gour Saha
> Assignee: thomas liu
> Fix For: Slider 0.81
>
>
> YARN-611 provides this feature. Currently Slider apps are bound by the number
> set for yarn.resourcemanager.am.max-retries in the cluster. By default this
> value is set to 2, which is very low for long-running services.
> Slider AM should use the feature provided in YARN-611 and set an interval
> after which the failure count will be reset to 0.
> I believe the property to set on ApplicationSubmissionContext is
> attemptFailuresValidityInterval (via setAttemptFailuresValidityInterval, in
> milliseconds). To start with, Slider can set it to 5 minutes, which should
> be a reasonable default.
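
The effect described above can be illustrated with a small, dependency-free
Java sketch. It is a toy model, not YARN's or Slider's actual code: the class
AttemptFailureWindow and its methods are hypothetical names introduced here,
and the assumption is that only failures younger than the validity interval
count against the maximum number of attempts. In a real client the interval
would presumably be passed, in milliseconds, to
setAttemptFailuresValidityInterval on the ApplicationSubmissionContext before
the application is submitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a failure-validity window (hypothetical, not YARN code). */
class AttemptFailureWindow {
    private final int maxAttempts;
    private final long validityIntervalMs; // <= 0 is assumed to mean "failures never expire"
    private final List<Long> failureTimesMs = new ArrayList<>();

    AttemptFailureWindow(int maxAttempts, long validityIntervalMs) {
        this.maxAttempts = maxAttempts;
        this.validityIntervalMs = validityIntervalMs;
    }

    /** Records one AM attempt failure at the given timestamp. */
    void recordFailure(long nowMs) {
        failureTimesMs.add(nowMs);
    }

    /** Counts only the failures that are still inside the validity window. */
    int countedFailures(long nowMs) {
        if (validityIntervalMs <= 0) {
            return failureTimesMs.size();
        }
        int counted = 0;
        for (long t : failureTimesMs) {
            if (nowMs - t < validityIntervalMs) {
                counted++;
            }
        }
        return counted;
    }

    /** A new attempt is allowed while the counted failures stay below the limit. */
    boolean canRetry(long nowMs) {
        return countedFailures(nowMs) < maxAttempts;
    }

    public static void main(String[] args) {
        long fiveMinutesMs = 5 * 60 * 1000L;
        long tenMinutesMs = 10 * 60 * 1000L;

        // Limit of 2 attempts with a 5 minute window: the failure at t=0 has expired.
        AttemptFailureWindow windowed = new AttemptFailureWindow(2, fiveMinutesMs);
        windowed.recordFailure(0L);
        windowed.recordFailure(tenMinutesMs);
        System.out.println(windowed.canRetry(tenMinutesMs)); // prints "true"

        // Same failures without a window: both count, so the limit is reached.
        AttemptFailureWindow unbounded = new AttemptFailureWindow(2, -1L);
        unbounded.recordFailure(0L);
        unbounded.recordFailure(tenMinutesMs);
        System.out.println(unbounded.canRetry(tenMinutesMs)); // prints "false"
    }
}
```

With a limit of 2 and no window, a long-running service that fails twice over
any length of time is finished; with the 5 minute window, two failures ten
minutes apart leave one attempt still available.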