I'm curious to learn what's been going on in Mesos (and the general
ecosystem) around service scheduling. In particular, I'm curious about how
Mesos might work in a cluster where service tasks are more common than
batch tasks, e.g., a cluster with a single framework for running stateless
tasks and many frameworks for running stateful tasks.

I haven't been able to find much information about how exactly service
scheduling fits with Mesos -- the dialogue is certainly skewed towards
ephemeral / batch scheduling at the moment. With that in mind, I've tried to
outline some topics I've been thinking about recently. What I'm really
curious to know is:

1. Am I way off track?
2. For a service scheduler built today, how much is Mesos responsible for
   and how much the framework? What about going forward?
3. Are there already some patterns/idioms for these kinds of things in
   existing frameworks?

# Balancing tasks within a framework

For this, imagine a framework that schedules long-lived (service), stateless
tasks.

- If asked to schedule a task with comparatively large resource
  requirements, the task may never get scheduled if the framework simply
  waits for a sufficiently large resource offer. Instead, the framework
  should attempt to reschedule existing tasks to "make room" for it. How
  might that work?

- If asked to schedule multiple copies of a task across different machines,
  some copies may never get scheduled if the framework simply waits for a
  sufficiently diverse set of resource offers. Instead, the framework should
  reschedule existing tasks to meet the availability requirements of the
  task. What might that look like?

Maybe both of these could be accomplished by using some combination of:

- using `requestResources` when large tasks are requested to try to get
  bigger offers.

- using saved offers to relaunch existing tasks, and then hoarding the freed
  resources for scheduling new tasks.
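
To make the second idea a bit more concrete, here is a minimal sketch of how
a framework might plan which tasks to relaunch elsewhere so that one host
has room for a large task. It is only an in-memory model: `Task`, `Host` and
`plan_make_room` are hypothetical names, not part of the Mesos API, and it
assumes the framework tracks the free resources it holds via saved offers.
It considers only cpus and memory and uses a greedy heuristic, so it will
not always find the smallest set of moves.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Task:
    name: str
    cpus: float
    mem: int  # MB


@dataclass
class Host:
    name: str
    free_cpus: float  # assumed: resources the framework holds in saved offers
    free_mem: int
    tasks: list = field(default_factory=list)


def plan_make_room(hosts, big):
    """Return (target_host, [(task_name, destination_host), ...]) or None.

    For each candidate target host, greedily move its largest tasks onto
    other hosts with spare capacity until `big` fits. The plan with the
    fewest moves wins. Nothing is mutated; this only produces a plan.
    """
    best = None
    for target in hosts:
        # Spare capacity on the other hosts, tracked on a scratch copy.
        spare = {h.name: [h.free_cpus, h.free_mem]
                 for h in hosts if h is not target}
        cpus, mem = target.free_cpus, target.free_mem
        moves = []
        ordered = sorted(target.tasks, key=lambda t: (t.cpus, t.mem),
                         reverse=True)
        for task in ordered:
            if cpus >= big.cpus and mem >= big.mem:
                break
            for dest, cap in spare.items():
                if cap[0] >= task.cpus and cap[1] >= task.mem:
                    cap[0] -= task.cpus
                    cap[1] -= task.mem
                    cpus += task.cpus
                    mem += task.mem
                    moves.append((task.name, dest))
                    break
        if cpus >= big.cpus and mem >= big.mem:
            if best is None or len(moves) < len(best[1]):
                best = (target.name, moves)
    return best


hosts = [
    Host("a", 1.0, 1024, [Task("web-1", 2.0, 2048), Task("web-2", 1.0, 1024)]),
    Host("b", 2.0, 2048, [Task("web-3", 2.0, 2048)]),
]
plan = plan_make_room(hosts, Task("big", 3.0, 3072))
```

With a plan in hand, the framework would relaunch each moved task using the
saved offer for its destination, then hold on to the resources freed on the
target host until the large task can be launched there.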

# Resource contention / balancing tasks across frameworks

For this, imagine there are two frameworks, one like above, running
stateless service tasks, the other responsible for a single stateful task.
Again, the cluster is relatively full.

- If the stateful scheduler wants to run its task on a particular machine,
  but that machine's resources are currently consumed by the other
  framework, what happens?

- If the stateful scheduler can run its task on any machine, but there
  exists no single offer sufficiently large to run the task, what does it
  do?

Some possible ways to approach this:

- The ability to request that other frameworks release their saved offers,
  as the resources may actually be available, but currently hoarded. I think
  `requestResources` on the scheduler might do this?

- The ability to request that other frameworks reschedule existing tasks.
  This could be a "user-land" feature? If I have a particular slave in mind
  to run my task and there is a way to find frameworks with tasks on that
  slave, I could randomly send some kind of "reschedule" message to one of
  the frameworks. This message might include the slave, my requested
  resources, and a priority understood by all of my frameworks. The other
  framework could then compare the priority of its own tasks with the one in
  the message, and decide whether it should reschedule.
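
As a rough illustration of that "reschedule" message, here is a sketch of
what the receiving framework's decision could look like. Everything in it is
an assumption for the sake of discussion: `RescheduleRequest`, `RunningTask`
and `tasks_to_yield` are made-up names, the priority scale is just a
convention that all of my frameworks would have to share, and how the
message actually travels between frameworks is left open. It also ignores
any resources that are already free on the slave.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RescheduleRequest:
    slave_id: str
    cpus: float
    mem: int       # MB
    priority: int  # higher wins; scale is a shared convention (assumed)


@dataclass(frozen=True)
class RunningTask:
    task_id: str
    slave_id: str
    cpus: float
    mem: int
    priority: int


def tasks_to_yield(request, running):
    """Return IDs of tasks this framework would move, or [] to decline.

    Only tasks on the requested slave with a strictly lower priority are
    considered, and the lowest-priority ones are given up first. If they
    cannot cover the request between them, the framework declines.
    """
    candidates = sorted(
        (t for t in running
         if t.slave_id == request.slave_id and t.priority < request.priority),
        key=lambda t: t.priority,
    )
    chosen, cpus, mem = [], 0.0, 0
    for task in candidates:
        if cpus >= request.cpus and mem >= request.mem:
            break
        chosen.append(task.task_id)
        cpus += task.cpus
        mem += task.mem
    if cpus >= request.cpus and mem >= request.mem:
        return chosen
    return []


running = [
    RunningTask("t1", "s1", 1.0, 512, priority=1),
    RunningTask("t2", "s1", 2.0, 1024, priority=2),
    RunningTask("t3", "s2", 4.0, 4096, priority=0),
    RunningTask("t4", "s1", 4.0, 4096, priority=9),
]
accepted = tasks_to_yield(RescheduleRequest("s1", 2.0, 1024, priority=5), running)
declined = tasks_to_yield(RescheduleRequest("s1", 2.0, 1024, priority=1), running)
```

The point of returning an empty list is that the receiving framework stays
in control: the message is a request, and a framework whose tasks all
outrank it can simply decline.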

Cheers,

Bernerd
Engineer @ SoundCloud
