[ 
https://issues.apache.org/jira/browse/SLIDER-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15856345#comment-15856345
 ] 

Billie Rinaldi commented on SLIDER-1199:
----------------------------------------

[~gsaha], thanks for the review. I was thinking that this block should be 
synchronized with 
[SliderAppMaster#executeNodeReview|https://github.com/apache/incubator-slider/blob/develop/slider-core/src/main/java/org/apache/slider/server/appmaster/SliderAppMaster.java#L1948-L1966].
That method calls appState.reviewRequestAndReleaseNodes (which in turn calls 
appState.updateBlacklist), and then executes all of the operations returned, 
which may include a blacklist operation. I think "creating a blacklist 
operation and executing it" should be synchronized as a single step, so that 
two threads cannot interleave between the creation and the execution.
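
To illustrate the concern, here is a minimal sketch, not the actual Slider code. The class, the lock object, and the helper names are hypothetical; only the shape of updateBlacklist(blacklistAdditions, blacklistRemovals) follows the YARN client API. It computes the blacklist delta and sends it while holding one lock, so a second caller cannot compute its own delta against state that the first caller has not yet applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: create and execute a blacklist operation under one lock.
final class BlacklistSyncSketch {

    /** Hypothetical stand-in for the YARN client; same parameter shape as updateBlacklist. */
    interface BlacklistClient {
        void updateBlacklist(List<String> blacklistAdditions, List<String> blacklistRemovals);
    }

    /** Hypothetical operation object returned by the review step. */
    static final class BlacklistOperation {
        final List<String> additions;
        final List<String> removals;

        BlacklistOperation(List<String> additions, List<String> removals) {
            this.additions = additions;
            this.removals = removals;
        }

        void execute(BlacklistClient client) {
            client.updateBlacklist(additions, removals);
        }
    }

    private final Object lock = new Object();
    private final Set<String> current = new TreeSet<>(); // nodes currently blacklisted
    private final BlacklistClient client;

    BlacklistSyncSketch(BlacklistClient client) {
        this.client = client;
    }

    /** Computes the delta between the desired and current blacklist; null if nothing changed. */
    private BlacklistOperation createOperation(Set<String> desired) {
        Set<String> additions = new TreeSet<>(desired);
        additions.removeAll(current);
        Set<String> removals = new TreeSet<>(current);
        removals.removeAll(desired);
        if (additions.isEmpty() && removals.isEmpty()) {
            return null;
        }
        current.clear();
        current.addAll(desired);
        return new BlacklistOperation(new ArrayList<>(additions), new ArrayList<>(removals));
    }

    /** Creation and execution happen inside the same synchronized block. */
    void reviewAndApply(Set<String> desired) {
        synchronized (lock) {
            BlacklistOperation op = createOperation(desired);
            if (op != null) {
                op.execute(client);
            }
        }
    }
}
```

If only createOperation were guarded, two threads could each build an operation and then execute them in the opposite order, leaving the RM with a blacklist that does not match the recorded state.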

Blacklist is a YARN concept. updateBlacklist, blacklistAdditions, and 
blacklistRemovals are taken directly from the YARN API. I think we should 
continue to use the same terminology that YARN does.

> Blacklist nodes that exceed the node failure threshold for a role
> -----------------------------------------------------------------
>
>                 Key: SLIDER-1199
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1199
>             Project: Slider
>          Issue Type: Bug
>          Components: appmaster
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>             Fix For: Slider 1.0.0
>
>         Attachments: SLIDER-1199.1.patch, SLIDER-1199.2.patch, 
> SLIDER-1199.3.patch, SLIDER-1199.4.patch
>
>
> From the code, it seems like when the node failure threshold for a role is 
> exceeded, that node is no longer suggested for placement. But there is 
> nothing preventing the RM from selecting the node again. If the node were 
> blacklisted, perhaps that would prevent new allocations on problem nodes.



