Re: Autoscaling in an IaaS environment

Yong Feng Sun, 02 Aug 2015 17:35:59 -0700

I prefer an auto-scaler outside Mesos as well. As long as Mesos exports
enough statistics, an outside auto-scaler should be able to make auto-scale
decisions as smart as the ones Mesos itself could make. It will also help to
decouple resource scheduling from resource infrastructure management. Mesos
just needs to focus on how to support adding/removing nodes dynamically and
gracefully without impacting running workloads, for example with features
such as host maintenance/removal.
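
To make the idea concrete, here is a minimal sketch of the decision part of
such an outside auto-scaler. It assumes the statistics come from the master's
/metrics/snapshot endpoint (keys such as master/cpus_total and
master/cpus_used). The function names and the 0.8/0.3 thresholds are only
illustrative assumptions, and the call to the IaaS API is left out on purpose.

```python
import json
from urllib.request import urlopen


def fetch_snapshot(master_url):
    """Fetch the master's metrics as a dict, e.g. from http://host:5050."""
    with urlopen(master_url + "/metrics/snapshot") as resp:
        return json.loads(resp.read().decode("utf-8"))


def utilization(snapshot, resource):
    """Return used/total for 'cpus' or 'mem', or None if there is no data."""
    total = snapshot.get("master/%s_total" % resource, 0)
    used = snapshot.get("master/%s_used" % resource, 0)
    return used / total if total else None


def scale_decision(snapshot, high=0.8, low=0.3):
    """Return +1 (add a node), -1 (remove a node) or 0 (do nothing).

    The thresholds are hypothetical; a real policy would likely also look
    at trends over time rather than at a single snapshot.
    """
    loads = [utilization(snapshot, r) for r in ("cpus", "mem")]
    loads = [x for x in loads if x is not None]
    if not loads:
        return 0  # no statistics available, so change nothing
    busiest = max(loads)
    if busiest >= high:
        return 1
    if busiest <= low:
        return -1
    return 0
```

A daemon would call fetch_snapshot() periodically, pass the result to
scale_decision(), and translate +1/-1 into whatever the IaaS API offers for
launching or terminating a VM.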


Besides, exporting statistics also helps with Mesos
diagnosing/troubleshooting, simulation, profiling and so on.

The only case an outside auto-scaler may not support is one where the
auto-scale decision has an impact on the scheduling decision. For example, a
resource manager (like Mesos) doesn't have to reclaim resources from a
framework if new nodes with the required resources are about to be added.
However, we could even argue whether it is a valid use case to make the
scheduling decision depend on the auto-scale decision.

Thanks


On Sun, Aug 2, 2015 at 12:56 PM, tommy xiao <xia...@gmail.com> wrote:

> What I want: write a daemon that queries the Mesos framework API to get the
> statistics, then invokes the IaaS's API to scale the cluster size.
>
> 2015-08-02 22:32 GMT+08:00 Alex Rukletsov <a...@mesosphere.com>:
>
> > I agree with Vinod that the Master accumulates a lot of statistics that can
> > be used for smarter decisions about cluster scaling. However, I'm not sure
> > this feature should reside in Mesos. I would rather expose statistics and/or
> > recommendations and let external tooling or an operator do the job.
> > On 31 Jul 2015 7:15 pm, "Vinod Kone" <vinodk...@gmail.com> wrote:
> >
> > > Thanks for pinging again Mathieu!
> > >
> > > > I think auto-scaling of a Mesos cluster is a nifty feature to have. The
> > > > only question in my mind (and likely others) is whether this functionality
> > > > should reside in Mesos, or a framework or an operator. As you mentioned,
> > > > Netflix took the framework way but it doesn't necessarily work in a
> > > > multi-framework environment. If the functionality lies with an operator it
> > > > has to be a library (likely a service) so that more people can take
> > > > advantage of it.
> > >
> > > > In my mind, it is not hard to imagine having this functionality in Mesos.
> > > > Since Mesos is in the best position to know the (current and perhaps
> > > > projected) state of the cluster it could make smart decisions about the
> > > > shape and size of the new nodes that can be added. This also becomes
> > > > interesting in the face of the quota
> > > > <https://issues.apache.org/jira/browse/MESOS-1791> work that we are
> > > > currently doing.
> > >
> > > > Having said that, I think you can do this today by writing an allocator
> > > > module. Note that Mesos already provides a requestResources() API call
> > > > (similar to Wish in your ppt) that is passed to the allocator. You should
> > > > be able to write an allocator module that takes this signal and talks to
> > > > your favorite IaaS API to spin up new node(s) if necessary.
> > >
> > >
> > > > On Fri, Jul 31, 2015 at 8:29 AM, Roger Ignazio <rigna...@gmail.com> wrote:
> > >
> > > > > With the number of IaaS providers out there, and the fact that Mesos
> > > > > doesn't really concern itself with where it's running (IaaS, bare-metal,
> > > > > on-prem, in the cloud), this sounds more like an operations problem than a
> > > > > feature that should be in Mesos core.
> > > >
> > > > > By any chance, have you had a chance to look at
> > > > > https://github.com/thefactory/autoscale-python? I'd venture to guess that
> > > > > project (or a homegrown solution talking to your IaaS' API), combined with
> > > > > some custom AWS AMIs (or vSphere templates or OpenStack images or ...),
> > > > > would satisfy your use-case.
> > > >
> > > > -- Roger
> > > >
> > > > > On Fri, Jul 31, 2015 at 5:37 AM, VELTEN, MATHIEU <mathieu.vel...@atos.net> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > I am currently working for some projects using Mesos at Atos Toulouse and
> > > > > > we are using it on top of a classical IaaS.
> > > > >
> > > > > > After playing with Mesos and looking at some code it appears to me that
> > > > > > there is no elasticity mechanism in place. I opened an issue in Jira some
> > > > > > months ago here, which contains most of the content of this email:
> > > > > > https://issues.apache.org/jira/browse/MESOS-2453
> > > > >
> > > > > > Here is what I have in mind (ppt in the following link for the detailed
> > > > > > and visual version ☺):
> > > > > > - Add the possibility for a framework to signal that it has some work
> > > > > >   pending (with or without further semantics regarding what resources
> > > > > >   are wished?)
> > > > > > - Modify the Mesos algo to call a pluggable driver when no resource is
> > > > > >   available and at least one framework has some work to do.
> > > > > >   In this case the driver should scale up the Mesos cluster by launching
> > > > > >   VMs. How many and of which size is a little tricky here without adding
> > > > > >   semantics to the framework signal.
> > > > > > - We should also add a flag somewhere to mark the slave as "volatile" so
> > > > > >   we can prefer the use of static resources, and shut down the volatile
> > > > > >   slaves after some time left unused.
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/presentation/d/1eNQSvDQ64gPNbmf0YVPq9tIWLMCbAHExos5WXrm0uqI/edit?usp=sharing
> > > > >
> > > > > > Does it look doable to you? What do you think about the principle?
> > > > > > Do you think we can add some semantics to the "I have work to do"
> > > > > > framework signal without breaking the two-level scheduling principle?
> > > > > > I don't think it violates it since both mechanisms (signaling a need and
> > > > > > effectively taking a resource from an offer) are fully independent in my
> > > > > > proposal but I feel a little out of my league to be sure.
> > > > >
> > > > > > This proposal currently doesn't specifically address bin packing, however
> > > > > > with the aforementioned modifications in place it should be easy to add
> > > > > > since we know which resources are volatile.
> > > > >
> > > > > > I have seen some other work (by Netflix for example) address this problem
> > > > > > however it always seems to be at the framework level and not inside the
> > > > > > core Mesos architecture, is there a reason for that except lack of time for
> > > > > > specification/contribution?
> > > > >
> > > > >
> > > >
> > >
> >
> http://fr.slideshare.net/spodila/aws-reinvent-2014-talk-scheduling-using-apache-mesos-in-the-cloud
> > > > >
> > > > > Regards,
> > > > >
> > > > > Mathieu Velten
> > > > > > This message and any attachments (the "message") are intended solely for
> > > > > > the addressee(s). It contains confidential information, that may be
> > > > > > privileged. If you receive this message in error, please notify the sender
> > > > > > immediately and delete the message. Any use of the message in violation of
> > > > > > its purpose, any dissemination or disclosure, either wholly or partially is
> > > > > > strictly prohibited, unless it has been explicitly authorized by the
> > > > > > sender. As its integrity cannot be secured on the internet, Atos and its
> > > > > > subsidiaries decline any liability for the content of this message.
> > > > > > Although the sender endeavors to maintain a computer virus-free network,
> > > > > > the sender does not warrant that this transmission is virus-free and will
> > > > > > not be liable for any damages resulting from any virus transmitted.
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
>
