Hi,

I am designing an auto-scaling solution for Mesos slaves in AWS and would
love some feedback on it. There are a few ways of doing it, and I was
thinking of starting with the simple cases first:

a. Define the lowest resource offer a framework can afford to get, then use
the information published by the Mesos master in state.json to determine
whether the cluster has enough resources. If the available resources won't
satisfy that lower bound, we bring up new EC2 instances with enough
resources for Mesos to make offers.

b. Latency for getting an offer for a given job. Say the framework has a
job which needs x cpu, y memory and z ports. If the framework doesn't get a
suitable offer within time t, we scale up the ASG whose EC2 instance type
can offer that amount of resources.

c. Maintain historical information about the resources used and the jobs
submitted and running in Mesos, and use that information for predictive
autoscaling.
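
To make (a) and (b) concrete, here is a minimal Python sketch of the
scale-up decision. It is only an illustration under assumptions: it takes
the master's state JSON already parsed into a dict, and it assumes each
entry under "slaves" carries "resources" and "used_resources" dicts with
"cpus" and "mem" keys (the exact fields may differ between Mesos versions).
The names Resources, free_per_slave, can_satisfy and should_scale_up are
hypothetical, and ports are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpus: float
    mem: float  # megabytes


def free_per_slave(state):
    """Return the unused Resources on each slave of a parsed state dict.

    Assumption: every slave entry has "resources" and "used_resources"
    dicts keyed by "cpus" and "mem"; missing values are treated as 0.
    """
    free = []
    for slave in state.get("slaves", []):
        total = slave.get("resources", {})
        used = slave.get("used_resources", {})
        free.append(Resources(
            cpus=total.get("cpus", 0.0) - used.get("cpus", 0.0),
            mem=total.get("mem", 0.0) - used.get("mem", 0.0),
        ))
    return free


def can_satisfy(state, lower_bound):
    """Option (a): True if some single slave has at least lower_bound free."""
    return any(
        f.cpus >= lower_bound.cpus and f.mem >= lower_bound.mem
        for f in free_per_slave(state)
    )


def should_scale_up(state, lower_bound, oldest_wait_s, max_wait_s):
    """Combine (a) and (b): scale up when no slave can meet the lower
    bound, or when the oldest pending job has waited longer than max_wait_s.
    """
    return (not can_satisfy(state, lower_bound)) or oldest_wait_s > max_wait_s
```

The check is per slave rather than cluster-wide because an offer comes from
a single slave, so summed free capacity can look sufficient while no
individual offer is large enough. Actually resizing the ASG is left out.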

I would like to understand whether there are better ways of achieving
elasticity in a Mesos cluster, where the complexity lies, and what
information Mesos could provide to make this more efficient.

-- 
Thanks,
Diptanu Choudhury
Web - www.linkedin.com/in/diptanu
Twitter - @diptanu <http://twitter.com/diptanu>
