[ 
https://issues.apache.org/jira/browse/SLIDER-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744028#comment-14744028
 ] 

Steve Loughran commented on SLIDER-82:
--------------------------------------


Hi

I've looked at this; it's now in the git repo as 
{{feature/SLIDER-82_ANTI_AFFINITY_REQUIRED}}

I can see that it works, but not very well, and it may not work if more than 
one role is being placed.

The blacklist isn't per priority, it's for every role: you can't request >1 
role type at the same time. Specifically, if I were trying to place role 
"hbase worker" and had blacklisted all but one node, a request for hbase 
master might pick up that same blacklist and not get placed.

Even if it were per-role, we can't stop >1 container being allocated on the 
same node.

What I do like is the brute force "reject all non-affine allocations" in 
onContainerAllocated(). It's inefficient, but, provided we are allocated
containers on different nodes, will succeed. 
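
A minimal sketch of that brute-force check, in Java. The {{Allocation}} and 
{{AntiAffinityFilter}} classes are hypothetical stand-ins for illustration, 
not Slider or YARN types, and the sketch assumes a single role whose 
in-use hosts are tracked in one set.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for an allocated container: an id plus the host it landed on. */
final class Allocation {
    final String containerId;
    final String host;

    Allocation(String containerId, String host) {
        this.containerId = containerId;
        this.host = host;
    }
}

/** Hypothetical filter: keeps at most one container per host for a single role. */
final class AntiAffinityFilter {
    /** Hosts that already run an instance of the role being placed. */
    private final Set<String> hostsInUse = new HashSet<>();

    /**
     * Walks a batch of allocations. The first container seen on a host is
     * appended to {@code accepted}; any further container on an already-used
     * host is returned so the caller can release it.
     */
    List<Allocation> rejectNonAffine(List<Allocation> allocated, List<Allocation> accepted) {
        List<Allocation> toRelease = new ArrayList<>();
        for (Allocation a : allocated) {
            if (hostsInUse.add(a.host)) {
                accepted.add(a);      // host was free: keep this container
            } else {
                toRelease.add(a);     // host already in use: hand it back
            }
        }
        return toRelease;
    }
}
```

Everything returned by {{rejectNonAffine()}} would have to be released and 
requested again, which is where the cost comes from.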

Two problems with it:
* there are no guarantees in the scheduler that you don't get the same one 
back, so blacklisting is essential.
* preemption means that being given containers that are then released is 
very expensive: other people's work is lost.

Which makes me realise that yes, you do need to use the blacklist, as your 
code does.


Anyway, I don't see that we can do it as is. Without YARN helping, the only 
way that we can be sure things work is if we ask for exactly one instance of 
one role at a time.

The algorithm would be:

for each role with requests to make:
   blacklist all nodes which are either failed or hosting an instance already
   request exactly one new node
   wait for that allocation before continuing to ask for another instance
   or the next role
   
It would be slow. Indeed, if container requests could not be satisfied for 
one role, then all other roles could block. But it would ensure that 
allocated nodes would be anti-affine.
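
The loop above could be sketched roughly as follows, in Java. This is a 
simulation under stated assumptions, not Slider code: 
{{SequentialAntiAffinePlacer}} is a hypothetical class, and the scheduler is 
modelled as a plain function that takes a blacklist and returns one host, or 
nothing if the request cannot be satisfied. A real application master would 
wait on an asynchronous allocation instead of returning early.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch: place one instance of one role at a time. */
final class SequentialAntiAffinePlacer {
    private final Set<String> failedNodes;
    /** Stand-in for the scheduler: given a blacklist, returns one host or empty. */
    private final Function<Set<String>, Optional<String>> requestOneNode;

    SequentialAntiAffinePlacer(Set<String> failedNodes,
                               Function<Set<String>, Optional<String>> requestOneNode) {
        this.failedNodes = failedNodes;
        this.requestOneNode = requestOneNode;
    }

    /**
     * @param desired role name mapped to desired instance count, in placement order
     * @return role name mapped to the hosts its instances were placed on
     */
    Map<String, List<String>> place(Map<String, Integer> desired) {
        Map<String, List<String>> placed = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> role : desired.entrySet()) {
            List<String> hosts = new ArrayList<>();
            placed.put(role.getKey(), hosts);
            while (hosts.size() < role.getValue()) {
                // Blacklist failed nodes plus nodes already hosting this role.
                Set<String> blacklist = new HashSet<>(failedNodes);
                blacklist.addAll(hosts);
                // Request exactly one node and do nothing else until it resolves.
                Optional<String> host = requestOneNode.apply(blacklist);
                if (!host.isPresent()) {
                    // Unsatisfiable request: this role blocks, and so do all
                    // later roles. The sketch stops here instead of waiting.
                    return placed;
                }
                hosts.add(host.get());
            }
        }
        return placed;
    }
}
```

Because the blacklist is rebuilt for each request from the current role's 
hosts only, a later role can still land on a node used by an earlier one. 
The early return shows the blocking behaviour: roles after an unsatisfiable 
one never get requested.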
   



> Support ANTI_AFFINITY_REQUIRED option
> -------------------------------------
>
>                 Key: SLIDER-82
>                 URL: https://issues.apache.org/jira/browse/SLIDER-82
>             Project: Slider
>          Issue Type: Task
>          Components: appmaster
>            Reporter: Steve Loughran
>             Fix For: Slider 2.0.0
>
>         Attachments: SLIDER-82.002.patch
>
>
> slider has an anti-affinity flag in roles (visible in resources.json?), which 
> is ignored.
> YARN-1042 promises this for YARN, slider will need
> # flag in resources.json
> # use in container requests
> we may also want two policies: anti-affinity-desired, and -required. Then if 
> required nodes get >1 container for the same component type on the same node, 
> it'd have to request a new one and return the old one (Risk: getting the same 
> one back). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
