[
https://issues.apache.org/jira/browse/SLIDER-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran resolved SLIDER-758.
-----------------------------------
Resolution: Duplicate
duplicate of SLIDER-743
> Slider placement requests to skip unreliable nodes
> --------------------------------------------------
>
> Key: SLIDER-758
> URL: https://issues.apache.org/jira/browse/SLIDER-758
> Project: Slider
> Issue Type: Improvement
> Components: appmaster
> Affects Versions: Slider 0.60
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Fix For: Slider 0.70
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> As discussed on the developer list, Slider's "prefer previously used nodes"
> policy is biased towards recently used nodes, even when those nodes are
> failing to launch containers successfully.
> As we already track node failure rates, the placement logic can be enhanced
> to not generate "placed" requests on nodes with a (recent) failure history
> of that component type.
> The initial iteration of this feature will not use the YARN blacklisting
> APIs; instead it will build up history in the AM, history that will be lost
> on AM restart. Accordingly, even unplaced requests may end up being
> scheduled on the unreliable nodes.
> This strategy (which we could revisit in future), combined with a regular
> reset of the failure counters, stops Slider blacklisting nodes whose failure
> rate was high some time previously but which are now reliable again.
> Testing: primarily via mocking
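
The placement idea described above can be sketched roughly as follows. This is
a minimal illustration only, not Slider's actual code: the class name
PlacementSketch, its methods, and the fixed failure threshold are all
hypothetical, and the history lives purely in memory (so, as the description
notes, it would be lost on AM restart).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of AM-side placement that skips unreliable nodes. */
final class PlacementSketch {
    // Assumed policy knob: failures since the last reset that make a node "unreliable".
    private final int failureThreshold;

    // In-memory failure history keyed by "node/component"; not persisted anywhere.
    private final Map<String, Integer> failures = new HashMap<>();

    PlacementSketch(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    private static String key(String node, String component) {
        return node + "/" + component;
    }

    /** Record one failed container launch of a component type on a node. */
    void recordFailure(String node, String component) {
        failures.merge(key(node, component), 1, Integer::sum);
    }

    /** True if the node has reached the threshold for this component type. */
    boolean isUnreliable(String node, String component) {
        return failures.getOrDefault(key(node, component), 0) >= failureThreshold;
    }

    /** Regular reset, so nodes that have recovered become eligible again. */
    void resetFailureCounters() {
        failures.clear();
    }

    /**
     * Pick the most recently used node that is not unreliable for this
     * component. An empty result means "issue an unplaced request", which
     * the scheduler may still satisfy on an unreliable node.
     */
    Optional<String> choosePlacedNode(List<String> recentNodes, String component) {
        for (String node : recentNodes) {
            if (!isUnreliable(node, component)) {
                return Optional.of(node);
            }
        }
        return Optional.empty();
    }
}
```

Failures are tracked per component type, so a node that keeps failing one
component can still receive placed requests for another. A test in the spirit
of "primarily via mocking" could drive recordFailure() directly and check
which node choosePlacedNode() returns before and after a reset.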