Hi Benjamin

Right, Mesos has to orchestrate the shrink: for example, notify frameworks to
gracefully terminate their workloads, or even make the scheduling decision
about which host will be closed and reclaimed. However, that does not mean
Mesos has to be built with the policy that triggers the auto-scaling.

The policy for auto-scaling the Mesos cluster itself tries to meet the overall
SLA of the Mesos cluster, which may not be the SLA of a specific framework,
even though the two may be related.

It is probably better to have Mesos focus on resource sharing among
frameworks to meet each framework's SLA, while an outside auto-scaler
monitors Mesos and works with it to meet the SLA of the Mesos cluster (all
the frameworks).
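
To make that split a bit more concrete, here is a rough sketch of what the
policy side of an outside auto-scaler could look like. It only decides; the
actual shrink would still be orchestrated by Mesos. Everything in it is an
assumption for illustration: the names (Thresholds, utilization, decide) and
the threshold values are made up, and the metric keys are what I would
expect to read from the master's /metrics/snapshot endpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical policy knobs; the values are illustrative only."""
    scale_up_at: float = 0.85    # grow when any resource is this full
    scale_down_at: float = 0.40  # consider shrinking below this level


def utilization(snapshot):
    """Return the highest used/total ratio across CPU and memory.

    `snapshot` is a dict shaped like the master's metrics snapshot
    (keys such as "master/cpus_total" are assumed here).
    Returns None when no totals are reported, e.g. no agents registered.
    """
    ratios = []
    for resource in ("cpus", "mem"):
        total = snapshot.get(f"master/{resource}_total", 0.0)
        used = snapshot.get(f"master/{resource}_used", 0.0)
        if total > 0:
            ratios.append(used / total)
    return max(ratios) if ratios else None


def decide(snapshot, thresholds=Thresholds()):
    """Map a metrics snapshot to "scale_up", "scale_down" or "hold"."""
    level = utilization(snapshot)
    if level is None:
        return "hold"  # nothing to base a decision on
    if level >= thresholds.scale_up_at:
        return "scale_up"
    if level <= thresholds.scale_down_at:
        return "scale_down"
    return "hold"
```

A daemon would poll the master, call decide() on each snapshot, and then
either ask the IaaS for new nodes or ask Mesos to drain a host. The policy
stays outside Mesos, while the graceful removal stays inside it.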

Thanks,

Yong

On Mon, Aug 3, 2015 at 2:34 PM, Benjamin Mahler <benjamin.mah...@gmail.com>
wrote:

> With auto-scaling, shrinking is not as easy as growing. For example, we may
> need to "defragment" the cluster in order to shrink the number of slaves,
> and Mesos seems to be in the best position to orchestrate such a process if
> you want to do this based on framework's SLA constraints (would re-use inverse
> offers).
>
> On Sun, Aug 2, 2015 at 5:35 PM, Yong Feng <fengyong...@gmail.com> wrote:
>
> > I prefer an auto-scaler outside Mesos as well. As long as Mesos exports
> > enough statistics, an outside auto-scaler should be able to make
> > auto-scale decisions as smart as Mesos itself would. It will also help to
> > decouple resource scheduling from resource infrastructure management.
> > Mesos just needs to focus on how to support adding/removing nodes
> > dynamically and gracefully without impacting the running workload, with
> > features such as host maintenance/removal ....
> >
> > Besides, exporting statistics also helps with Mesos
> > diagnosing/troubleshooting, simulation, profiling and so on.
> >
> > The only case an outside auto-scaler may not support is one where the
> > auto-scale decision affects the scheduling decision; for example, a
> > resource manager (like Mesos) doesn't have to reclaim resources from a
> > framework if new nodes with the required resources will be added.
> > However, we could even argue about whether it is a valid use case to
> > have the scheduling decision depend on the auto-scale decision.
> >
> > Thanks
> >
> >
> > On Sun, Aug 2, 2015 at 12:56 PM, tommy xiao <xia...@gmail.com> wrote:
> >
> > > What I want: write a daemon that queries the Mesos framework API to get
> > > statistics, then invokes the IaaS's API to scale the cluster size.
> > >
> > > 2015-08-02 22:32 GMT+08:00 Alex Rukletsov <a...@mesosphere.com>:
> > >
> > > > I agree with Vinod that the Master accumulates a lot of statistics
> > > > that can be used for smarter decisions about cluster scaling.
> > > > However, I'm not sure this feature should reside in Mesos. I would
> > > > rather expose statistics and/or recommendations and let external
> > > > tooling or an operator do the job.
> > > > On 31 Jul 2015 7:15 pm, "Vinod Kone" <vinodk...@gmail.com> wrote:
> > > >
> > > > > Thanks for pinging again Mathieu!
> > > > >
> > > > > I think auto-scaling of a Mesos cluster is a nifty feature to have.
> > > > > The only question in my mind (and likely others) is whether this
> > > > > functionality should reside in Mesos, or a framework or an operator.
> > > > > As you mentioned, Netflix took the framework way but it doesn't
> > > > > necessarily work in a multi-framework environment. If the
> > > > > functionality lies with an operator it has to be a library (likely a
> > > > > service) so that more people can take advantage of it.
> > > > >
> > > > > In my mind, it is not hard to imagine having this functionality in
> > > > > Mesos. Since Mesos is in the best position to know the (current and
> > > > > perhaps projected) state of the cluster it could make smart
> > > > > decisions about the shape and size of the new nodes that can be
> > > > > added. This also becomes interesting in the face of the quota
> > > > > <https://issues.apache.org/jira/browse/MESOS-1791> work that we are
> > > > > currently doing.
> > > > >
> > > > > Having said that, I think you can do this today by writing an
> > > > > allocator module. Note that Mesos already provides a
> > > > > requestResources() API call (similar to Wish in your ppt) that is
> > > > > passed to the allocator. You should be able to write an allocator
> > > > > module that takes this signal and talks to your favorite IaaS API to
> > > > > spin up new node(s) if necessary.
> > > > >
> > > > >
> > > > > On Fri, Jul 31, 2015 at 8:29 AM, Roger Ignazio <rigna...@gmail.com> wrote:
> > > > >
> > > > > > With the number of IaaS providers out there, and the fact that
> > > > > > Mesos doesn't really concern itself with where it's running (IaaS,
> > > > > > bare-metal, on-prem, in the cloud), this sounds more like an
> > > > > > operations problem than a feature that should be in Mesos core.
> > > > > >
> > > > > > Have you had a chance to look at
> > > > > > https://github.com/thefactory/autoscale-python? I'd venture to
> > > > > > guess that project (or a homegrown solution talking to your IaaS'
> > > > > > API), combined with some custom AWS AMIs (or vSphere templates or
> > > > > > OpenStack images or ...), would satisfy your use-case.
> > > > > >
> > > > > > -- Roger
> > > > > >
> > > > > > On Fri, Jul 31, 2015 at 5:37 AM, VELTEN, MATHIEU <mathieu.vel...@atos.net> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I am currently working on some projects using Mesos at Atos
> > > > > > > Toulouse, and we are using it on top of a classical IaaS.
> > > > > > >
> > > > > > > After playing with Mesos and looking at some code it appears to
> > > > > > > me that there is no elasticity mechanism in place. I opened an
> > > > > > > issue in Jira some months ago, which contains most of the
> > > > > > > content of this email:
> > > > > > > https://issues.apache.org/jira/browse/MESOS-2453
> > > > > > >
> > > > > > > Here is what I have in mind (ppt in the following link for the
> > > > > > > detailed and visual version ☺):
> > > > > > > - Add the possibility for a framework to signal that it has some
> > > > > > >   work pending (with or without further semantics regarding
> > > > > > >   which resources are wished for?)
> > > > > > > - Modify the Mesos algorithm to call a pluggable driver when no
> > > > > > >   resource is available and at least one framework has some work
> > > > > > >   to do. In this case the driver should scale up the Mesos
> > > > > > >   cluster by launching VMs. How many and of which size is a
> > > > > > >   little tricky here without adding semantics to the framework
> > > > > > >   signal.
> > > > > > > - We should also add a flag somewhere to mark the slave as
> > > > > > >   "volatile" so we can prefer the use of static resources, and
> > > > > > >   shut down the volatile slaves after some time left unused.
> > > > > > >
> > > > > > > https://docs.google.com/presentation/d/1eNQSvDQ64gPNbmf0YVPq9tIWLMCbAHExos5WXrm0uqI/edit?usp=sharing
> > > > > > >
> > > > > > > Does it look doable to you? What do you think about the
> > > > > > > principle? Do you think we can add some semantics to the "I have
> > > > > > > work to do" framework signal without breaking the two-level
> > > > > > > scheduling principle? I don't think it violates it, since both
> > > > > > > mechanisms (signaling a need and actually taking a resource from
> > > > > > > an offer) are fully independent in my proposal, but I feel a
> > > > > > > little out of my league to be sure.
> > > > > > >
> > > > > > > This proposal currently doesn't specifically address bin
> > > > > > > packing; however, with the aforementioned modifications in place
> > > > > > > it should be easy to add since we know which resources are
> > > > > > > volatile.
> > > > > > >
> > > > > > > I have seen some other work (by Netflix for example) address
> > > > > > > this problem; however, it always seems to be at the framework
> > > > > > > level and not inside the core Mesos architecture. Is there a
> > > > > > > reason for that other than lack of time for
> > > > > > > specification/contribution?
> > > > > > >
> > > > > > > http://fr.slideshare.net/spodila/aws-reinvent-2014-talk-scheduling-using-apache-mesos-in-the-cloud
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > Mathieu Velten
> > > > > > >
> > > > > > > This message and any attachments (the "message") are intended
> > > > > > > solely for the addressee(s). It contains confidential
> > > > > > > information, that may be privileged. If you receive this message
> > > > > > > in error, please notify the sender immediately and delete the
> > > > > > > message. Any use of the message in violation of its purpose, any
> > > > > > > dissemination or disclosure, either wholly or partially is
> > > > > > > strictly prohibited, unless it has been explicitly authorized by
> > > > > > > the sender. As its integrity cannot be secured on the internet,
> > > > > > > Atos and its subsidiaries decline any liability for the content
> > > > > > > of this message. Although the sender endeavors to maintain a
> > > > > > > computer virus-free network, the sender does not warrant that
> > > > > > > this transmission is virus-free and will not be liable for any
> > > > > > > damages resulting from any virus transmitted.
> > > > > > >
> > >
> > > --
> > > Deshi Xiao
> > > Twitter: xds2000
> > > E-mail: xiaods(AT)gmail.com
> > >
