[ 
https://issues.apache.org/jira/browse/SLIDER-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15855451#comment-15855451
 ] 

Gour Saha commented on SLIDER-1199:
-----------------------------------

[~billie.rinaldi] the .4 patch looks good.

A few comments:
1.
_updateBlacklist_ in _AppState.java_ is synchronized. Does the following block in 
_ResetFailureWindow.java_ need to be synchronized on _appMaster_?

{code}
    synchronized (appMaster) {
      appState.resetFailureCounts();
      AbstractRMOperation blacklistOperation = appState.updateBlacklist();
      if (blacklistOperation != null) {
        blacklistOperation.execute(operationHandler);
      }
    }
{code}
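
To make the question concrete, here is a minimal sketch (not Slider code) of which monitors the calling thread holds in each case. The _AppState_ class and the plain _appMaster_ object below are hypothetical stand-ins, and _updateBlacklist_ takes an extra argument only so that it can report on the outer lock. It assumes nothing beyond standard Java semantics: a synchronized instance method locks its own instance, not the caller's lock object.

```java
/** Illustration only: AppState and appMaster here are hypothetical stand-ins. */
class BlacklistLockSketch {

    /** Stand-in for AppState, with a synchronized method like updateBlacklist. */
    static class AppState {
        /** Returns {holds AppState monitor, holds appMaster monitor} for the calling thread. */
        synchronized boolean[] updateBlacklist(Object appMaster) {
            return new boolean[] {
                Thread.holdsLock(this),
                Thread.holdsLock(appMaster)
            };
        }
    }

    /** Calls updateBlacklist, optionally wrapped in synchronized (appMaster). */
    static boolean[] locksHeld(boolean wrapInAppMasterLock) {
        AppState appState = new AppState();
        Object appMaster = new Object();
        if (wrapInAppMasterLock) {
            synchronized (appMaster) {
                return appState.updateBlacklist(appMaster);
            }
        }
        return appState.updateBlacklist(appMaster);
    }

    public static void main(String[] args) {
        boolean[] wrapped = locksHeld(true);
        boolean[] bare = locksHeld(false);
        System.out.println("wrapped: appState=" + wrapped[0] + ", appMaster=" + wrapped[1]);
        System.out.println("bare:    appState=" + bare[0] + ", appMaster=" + bare[1]);
    }
}
```

In both cases the AppState monitor is held during the call, so the outer block adds only the _appMaster_ monitor. Whether that extra lock is needed depends on whether other code relies on it to keep the reset, the blacklist update, and the execute step atomic as a group.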

2.
Throughout this patch we refer to updateBlacklist, blacklistOperation, 
blacklistAdditions, blacklistRemovals, etc. At this point all of these refer 
to a blacklist of nodes. Would it make more sense to name them explicitly as 
updateNodesBlacklist, nodesBlacklistOperation, nodesBlacklistAdditions, etc.? 
"Blacklist" is pretty generic and could apply to, say, containers or 
applications down the line. What do you think?

> Blacklist nodes that exceed the node failure threshold for a role
> -----------------------------------------------------------------
>
>                 Key: SLIDER-1199
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1199
>             Project: Slider
>          Issue Type: Bug
>          Components: appmaster
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>             Fix For: Slider 1.0.0
>
>         Attachments: SLIDER-1199.1.patch, SLIDER-1199.2.patch, 
> SLIDER-1199.3.patch, SLIDER-1199.4.patch
>
>
> From the code, it seems like when the node failure threshold for a role is 
> exceeded, that node is no longer suggested for placement. But there is 
> nothing preventing the RM from selecting the node again. If the node were 
> blacklisted, perhaps that would prevent new allocations on problem nodes.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
