Re: Task revocation

Alex Rukletsov Thu, 09 Jul 2015 11:02:47 -0700

I think we have a similar concept already: InverseOffers [1]. Is that what
you mean?


[1]: https://issues.apache.org/jira/browse/MESOS-2061

On Thu, Jul 9, 2015 at 4:14 PM, Gidon Gershinsky <[email protected]> wrote:

> By the way, a thought - why don't we send negative resource offers, from
> an allocation plug-in module, to the frameworks (via the standard
> offerCallback API)...
> Not as clean as a first-class resource revoke API, but maybe not too bad
> either. Will verify if this actually works in the current implementation
> (there could be sanity checks on offer values).
>
>
>
> Regards,
> Gidon
>
> From:   Gidon Gershinsky/Haifa/IBM@IBMIL
> To:     [email protected]
> Date:   07/07/2015 06:13 PM
> Subject:        Re: Task revocation
>
>
>
> Having a kill action will work fine for our module. By the way, our policy
> mechanism doesn't kill individual tasks by their IDs, but rather kills
> some task(s) of frameworks (or roles) that currently use more resources
> than they deserve, to free up a defined amount of cpu/mem/etc. Hence, the
> allocator continues to work with the framework/role entities (offer
> resources, or revoke resources); we won't need to handle individual tasks
> there.
>
>
> Regards,
> Gidon
>
> From:   Alex Rukletsov <[email protected]>
> To:     dev <[email protected]>
> Date:   07/07/2015 05:45 PM
> Subject:        Re: Task revocation
>
>
>
> I think we should move away from the callback architecture in Allocator
> and avoid introducing new callbacks. In order to implement things like
> optimistic offers [1], we may need to generate and bookkeep resource
> offers in an allocator. In this case, a cleaner and more general solution
> is to keep a queue of actions in the allocator and let the master fetch
> from this queue. Actions might be: resource offer, reservation order, kill
> task and so on. This will also allow us to introduce and delegate new
> actions to allocators without changing the Allocator interface, without
> breaking old allocators that do not support new actions.
>
> With the allocator being modularized now, and given there are some ongoing
> efforts around alternative allocators, maybe we should start a discussion
> about cleaning up the Allocator interface?
>
> [1] https://issues.apache.org/jira/browse/MESOS-1607
>
> On Tue, Jul 7, 2015 at 3:45 PM, Gidon Gershinsky <[email protected]> wrote:
>
> > Adam,
> >
> > Having an HTTP API to kill a task would help.
> > I'd also suggest considering a 'native' killTask C++ API that can be
> > called from within a resource allocation module (
> > https://issues.apache.org/jira/browse/MESOS-2160).
> > This is a more direct route, since the module runs inside the master
> > process.
> > Currently, an allocation module is given one callback (to send resource
> > offers) upon initialization. Technically, it should be possible to add
> > another callback, for killing the tasks.
> > Authorization-wise, it's not a problem, given the power already provided
> > to such plug-in modules, which can easily starve frameworks etc.
> >
> >
> > From: Adam Bordelon <[email protected]>
> > Date: Mon, Jul 6, 2015 at 11:48 AM
> > Subject: Re: Task revocation
> > To: dev <[email protected]>
> >
> >
> > Gidon,
> >
> > If your allocation module is capable of sending protobuf messages to the
> > master, you could send a KillTaskMessage with the proper frameworkId and
> > taskId and hack a way around the if condition at
> > https://github.com/apache/mesos/blob/0.23.0-rc1/src/master/master.cpp#L2946
> >
> > I think in general, it would be great to add an authenticated/authorized
> > killTask endpoint on the master, for operators, tools, or services like
> > your allocator to use to kill tasks. This may be incorporated in the
> > upcoming HTTP API redesign.
> >
> > On Thu, Jul 2, 2015 at 8:42 AM, Gidon Gershinsky <[email protected]> wrote:
> >
> > > Thanks.
> > >
> > > Yep, that's what I meant by an out-of-band option. I wonder if there
> > > is a direct way to kill a task from inside a resource allocation
> > > module. Or to send a message from the resource allocation module to a
> > > framework.
> > >
> > > On Thu, Jul 2, 2015 at 6:09 PM, haosdent <[email protected]> wrote:
> > >
> > > > Hi, @Gidon. In fact, when you call framework.killTask, the framework
> > > > sends a KillTaskMessage containing the frameworkId and taskId to the
> > > > Master. The Master then forwards this message to the Slave, asking
> > > > it to kill the task.
> > > >
> > > > On Thu, Jul 2, 2015 at 10:18 PM, Gidon Gershinsky <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > We are developing a resource allocation module, using the new
> > > > > plug-in mechanism (taken directly from GitHub, thanks Alex, works
> > > > > smoothly).
> > > > >
> > > > > Our module will need to revoke/kill tasks, e.g. to make space for
> > > > > more important ones (say when they call driver.requestResources,
> > > > > and no resources are available). I know task revocation can't be
> > > > > done currently from the master/module, but there is a driver API
> > > > > that enables frameworks to kill their tasks.
> > > > >
> > > > > I can implement an out-of-band mechanism, where the module will
> > > > > communicate with frameworks on a proprietary protocol, and tell
> > > > > them to kill a task. But I wonder if there is another option for
> > > > > master-framework messaging (callable from the allocation module),
> > > > > maybe similar to executor-framework messaging? Or plans to add a
> > > > > kill task API in the master/module?
> > > > >
> > > > > Thanks, Gidon
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards,
> > > > Haosdent Huang
> > > >
> > >
> >
>
>
>
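
The action-queue idea from Alex's 07/07 message in this thread (the allocator
records actions, the master fetches them) could be sketched roughly as below.
This is only an illustration under stated assumptions: ActionKind, Action,
QueueingAllocator and describe are hypothetical names invented for this
sketch and are not part of the Mesos Allocator interface, and the payload
strings are made-up placeholders.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical action kinds; none of these names come from the Mesos source.
enum class ActionKind { RESOURCE_OFFER, RESERVATION_ORDER, KILL_TASK };

// One queued request from the allocator to the master.
struct Action {
  ActionKind kind;
  std::string frameworkId;  // framework the action applies to
  std::string payload;      // e.g. a resource string or a task ID
};

// Sketch of an allocator that records actions instead of invoking callbacks.
class QueueingAllocator {
public:
  // Called by the allocation policy code inside the allocator.
  void enqueue(Action action) { pending_.push_back(std::move(action)); }

  // Called by the master: removes and returns up to `max` actions, oldest first.
  std::vector<Action> fetch(std::size_t max) {
    std::vector<Action> batch;
    while (!pending_.empty() && batch.size() < max) {
      batch.push_back(std::move(pending_.front()));
      pending_.pop_front();
    }
    return batch;
  }

  std::size_t pendingCount() const { return pending_.size(); }

private:
  std::deque<Action> pending_;
};

// Master-side dispatch. A new action kind only adds a case here; the
// enqueue/fetch interface between master and allocator stays the same.
std::string describe(const Action& a) {
  switch (a.kind) {
    case ActionKind::RESOURCE_OFFER:
      return "offer " + a.payload + " to " + a.frameworkId;
    case ActionKind::RESERVATION_ORDER:
      return "reserve " + a.payload + " for " + a.frameworkId;
    case ActionKind::KILL_TASK:
      return "kill task " + a.payload + " of " + a.frameworkId;
  }
  return "unsupported action";
}
```

In this sketch an allocator that never enqueues KILL_TASK keeps working
unchanged, which is the backward-compatibility property the message argues
for.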

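Gidon's 07/07 message describes a policy that revokes from frameworks using
more than they deserve until a defined amount of resources is freed. One
possible reading of that, for CPUs only, is sketched below. Everything here
is an assumption made for illustration: the thread does not show the actual
module, and Task, Framework, entitledCpus and selectVictims are hypothetical
names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Task {
  std::string id;
  double cpus;
};

struct Framework {
  std::string id;
  double entitledCpus;      // hypothetical "deserved" share
  std::vector<Task> tasks;  // currently running tasks

  double usedCpus() const {
    double total = 0.0;
    for (const Task& t : tasks) total += t.cpus;
    return total;
  }
};

// Greedy sketch: visit frameworks from most to least over their share and
// pick tasks until at least `cpusNeeded` CPUs would be freed. A framework is
// left alone once it is no longer above its share; the last task picked from
// a framework may take it below that share.
std::vector<std::string> selectVictims(std::vector<Framework> frameworks,
                                       double cpusNeeded) {
  std::sort(frameworks.begin(), frameworks.end(),
            [](const Framework& a, const Framework& b) {
              return a.usedCpus() - a.entitledCpus >
                     b.usedCpus() - b.entitledCpus;
            });

  std::vector<std::string> victims;
  double freed = 0.0;
  for (const Framework& fw : frameworks) {
    double excess = fw.usedCpus() - fw.entitledCpus;
    for (const Task& task : fw.tasks) {
      if (freed >= cpusNeeded) return victims;  // target reached
      if (excess <= 0.0) break;                 // framework within its share
      victims.push_back(task.id);
      freed += task.cpus;
      excess -= task.cpus;
    }
  }
  return victims;  // may free less than requested if nobody is over share
}
```

The policy reasons about frameworks and amounts, and task IDs only appear
in the result, which matches the point that the allocator itself need not
handle individual tasks.
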