Cluster Maintenance

John Omernik Thu, 29 Oct 2015 11:21:18 -0700

I am wondering whether there is an easy way to take a healthy slave/agent
and start bleeding its running tasks off of it.


Basically, without relying on something that every framework would have to
support, I'd like the option to:

1. Stop offering the agent's resources to frameworks. I.e. no new resources
would be offered from that agent, but existing jobs/tasks continue to run.
2. Offer the ability, especially in the UI, but potentially in the API as
well, to "kill" a task. This would cause a failure that forces the
framework to respond. For example, if it was a Docker container running in
Marathon and I said "please kill this task", the task would be killed, and
Marathon would recognize the failure and try to restart the container.
Since our agent (in point 1) is not offering resources, the restarted task
would not land on the agent in question.
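
To make the two steps concrete, here is a toy sketch in Python of the
behaviour I have in mind. It is only a model: Agent, Cluster, drain,
launch and kill_task are names made up for illustration, not real Mesos or
Marathon APIs, and the "least loaded agent" placement rule is an assumption
standing in for whatever the framework actually does.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    draining: bool = False  # hypothetical flag: agent no longer offers resources
    tasks: set = field(default_factory=set)


class Cluster:
    """Toy model of a master plus a framework that restarts failed tasks."""

    def __init__(self, agents):
        self.agents = {a.name: a for a in agents}

    def drain(self, agent_name):
        # Step 1: stop offering this agent's resources; running tasks stay put.
        self.agents[agent_name].draining = True

    def launch(self, task):
        # Only agents that are not draining make offers.
        candidates = [a for a in self.agents.values() if not a.draining]
        if not candidates:
            raise RuntimeError("no agent is offering resources")
        # Assumed placement rule: least loaded agent, ties broken by name.
        target = min(candidates, key=lambda a: (len(a.tasks), a.name))
        target.tasks.add(task)
        return target.name

    def kill_task(self, agent_name, task):
        # Step 2: the operator kills one task; the framework sees the failure
        # and relaunches it on whichever agent is still offering resources.
        self.agents[agent_name].tasks.remove(task)
        return self.launch(task)
```

With two agents, draining the first and then killing its task by hand would
move that task to the second, while anything I had not killed yet would keep
running where it was.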


The reason for this manual bleeding is to, say, run updates on a node or
pull it out of service for other reasons (memory upgrades etc.) and do so
in a manual way. You may want to address what's running on the node
manually, so a wholesale "kill everything", while it SHOULD be doable, may
not always be feasible. In addition, the inverse offers feature seems neat,
but frameworks have to support it.

So, is there anything like that now that I am just missing in the
documentation? I am curious to hear how others are handling this situation
in their environments.

John
