[ 
https://issues.apache.org/jira/browse/MESOS-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606278#comment-16606278
 ] 

Greg Mann commented on MESOS-8933:
----------------------------------

In my mind, the ideal solution is for frameworks to be updated to understand 
unavailability when included in offers. However, I understand that it can be 
difficult to get changes upstreamed into multiple frameworks.
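
As a rough illustration of what framework-side handling could look like, here 
is a minimal Python sketch. It assumes offers arrive as JSON-style dicts in 
which the optional `unavailability` field is present whenever maintenance is 
scheduled for the agent; the helper name `partition_offers` is hypothetical 
and not taken from any existing framework.

```python
def has_unavailability(offer):
    """Return True if the offer carries an unavailability window."""
    # Assumption: the offer is a dict shaped like the JSON form of an Offer,
    # where `unavailability` is only set for agents with scheduled maintenance.
    return offer.get("unavailability") is not None


def partition_offers(offers):
    """Split offers into (usable, to_decline) based on unavailability."""
    usable, to_decline = [], []
    for offer in offers:
        if has_unavailability(offer):
            to_decline.append(offer)
        else:
            usable.append(offer)
    return usable, to_decline


if __name__ == "__main__":
    sample = [
        {"id": {"value": "offer-1"}},
        {"id": {"value": "offer-2"},
         "unavailability": {"start": {"nanoseconds": 1}}},
    ]
    usable, to_decline = partition_offers(sample)
    print([o["id"]["value"] for o in usable])
    print([o["id"]["value"] for o in to_decline])
```

A framework would launch tasks only on the usable offers and decline the 
rest. A real scheduler might also want to compare the window's start time 
against the expected task duration instead of declining unconditionally.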

Adding a new config flag is a possibility; one issue I see is that operators 
may want to configure this on a per-framework basis, rather than blocking such 
offers from all frameworks globally.
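
To make the global versus per-framework distinction concrete, here is a small 
Python sketch of the decision such a setting would have to encode. 
Everything in it (`DrainingOfferPolicy`, its fields, and matching frameworks 
by name) is hypothetical and only illustrates the shape of the policy; it 
does not describe any existing Mesos flag or allocator code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrainingOfferPolicy:
    # Hypothetical global switch: block offers from draining agents for all
    # frameworks.
    suppress_globally: bool = False
    # Hypothetical per-framework opt-in: names of frameworks that should not
    # receive offers from draining agents.
    suppress_for_frameworks: frozenset = frozenset()

    def should_offer(self, framework_name, agent_draining):
        """Decide whether an agent's resources may be offered to a framework."""
        if not agent_draining:
            return True
        if self.suppress_globally:
            return False
        return framework_name not in self.suppress_for_frameworks


if __name__ == "__main__":
    policy = DrainingOfferPolicy(suppress_for_frameworks=frozenset({"marathon"}))
    print(policy.should_offer("marathon", agent_draining=True))
    print(policy.should_offer("spark", agent_draining=True))
```

With a per-framework set, frameworks that already understand unavailability 
could keep receiving such offers while the others are shielded from them.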

[~kaysoky] do you have any thoughts on this ticket?

> Stop sending offers from agents in draining mode
> ------------------------------------------------
>
>                 Key: MESOS-8933
>                 URL: https://issues.apache.org/jira/browse/MESOS-8933
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Sagar Sadashiv Patwardhan
>            Priority: Minor
>
> *Background:*
> At Yelp, we use Mesos to run microservices (Marathon), batch jobs (Chronos and 
> custom frameworks), Spark (Spark Mesos framework), etc. We also autoscale the 
> number of agents in our cluster based on the current demand and some other 
> metrics. We use Mesos maintenance primitives to gracefully shut down Mesos 
> agents. 
> *Problem:*
> When we want to shut down an agent for some reason, we first move the agent 
> into draining mode. This allows us to gracefully terminate the microservices 
> and other tasks. However, Mesos continues to send offers from that agent with 
> unavailability set. Frameworks such as Marathon, Chronos, and Spark ignore 
> the unavailability and schedule tasks on the agent. To prevent this from 
> happening, we allocate all the available resources on that agent to a role 
> that is not used by any framework. This approach is not foolproof, though: 
> there is still a race condition between when we move the agent into draining 
> mode and when we allocate all the available resources on the agent to the 
> maintenance role.
> *Proposal:*
> It would be nice if Mesos stopped sending offers from agents in draining 
> mode. Something like this: 
> [https://gist.github.com/sagar8192/0b9dbccc908818f8f9f5a18d1f634513] I don't 
> know whether this affects the allocator. We can put this behind a 
> flag (something like --do-not-send-offers-from-agents-in-draining-mode) and 
> make it optional.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
