InverseOffers are designed for a different purpose, but technically they
seem well suited to policy-based resource revocation.
The comment

    // A request to "deallocate" or "return" any resources already
    // being consumed by the framework.

might contradict the presence of a "resources" field in the protobuf
message:

    message InverseOffer {
      required OfferID id = 1;
      required FrameworkID framework_id = 2;
      repeated Resource resources = 3;

In any case, we need partial revocation of the resources currently consumed
by a framework. Naturally, the implementation is up to the framework
developer, so having an input that specifies the amount of resources to be
freed makes this possible.
Also, the "unavailability" parameter and the feedback mechanism are useful
as well: we can specify a timeout (grace period) for deallocation, and
enable a framework to decline a revocation request for a good reason.
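
To illustrate the framework side of such a partial revocation, here is a
minimal C++ sketch. Everything in it is an assumption made for illustration:
the Resources, Task, RevocationRequest and RevocationResponse types and the
planRevocation function are hypothetical stand-ins, not the real Mesos
protobufs or driver API. It frees the requested amount by killing the least
important tasks first, and declines when the grace period is too short or
the amount cannot be freed.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; NOT the real Mesos protobuf types.
struct Resources {
  double cpus;
  double memMb;
};

struct Task {
  std::string id;
  Resources used;
  int priority;  // Assumption: a lower value means a less important task.
};

// Models an InverseOffer that carries `resources` plus a grace period
// derived from its "unavailability" information.
struct RevocationRequest {
  Resources toFree;
  int graceSeconds;
};

// The framework's answer: a list of tasks it will kill, or a decline.
struct RevocationResponse {
  bool accepted;
  std::vector<std::string> tasksToKill;
};

// Picks the least important tasks until the requested amount is covered.
// Declines if the grace period is too short or the amount cannot be freed.
RevocationResponse planRevocation(std::vector<Task> tasks,
                                  const RevocationRequest& request,
                                  int minGraceSeconds) {
  if (request.graceSeconds < minGraceSeconds) {
    return {false, {}};
  }

  std::stable_sort(tasks.begin(), tasks.end(),
                   [](const Task& a, const Task& b) {
                     return a.priority < b.priority;
                   });

  Resources freed{0.0, 0.0};
  std::vector<std::string> toKill;
  for (const Task& task : tasks) {
    if (freed.cpus >= request.toFree.cpus &&
        freed.memMb >= request.toFree.memMb) {
      break;  // Enough has been freed already.
    }
    freed.cpus += task.used.cpus;
    freed.memMb += task.used.memMb;
    toKill.push_back(task.id);
  }

  const bool satisfied = freed.cpus >= request.toFree.cpus &&
                         freed.memMb >= request.toFree.memMb;
  if (!satisfied) {
    return {false, {}};
  }
  return {true, toKill};
}
```

Which tasks to sacrifice remains a framework decision; the priority ordering
here is only one possible policy.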
Regards,
Gidon
From: Alex Rukletsov <[email protected]>
To: dev <[email protected]>
Date: 09/07/2015 09:01 PM
Subject: Re: Task revocation
I think we have a similar concept already: InverseOffers [1]. Is it what
you mean?
[1]: https://issues.apache.org/jira/browse/MESOS-2061
On Thu, Jul 9, 2015 at 4:14 PM, Gidon Gershinsky <[email protected]> wrote:
> By the way, a thought - why don't we send negative resource offers, from
> an allocation plug-in module, to the frameworks (via the standard
> offerCallback API)...
> Not as clean as a first-class resource revoke API, but maybe not too bad
> either. Will verify if this actually works in the current implementation
> (there could be sanity checks on offer values).
>
>
>
> Regards,
> Gidon
>
>
>
>
>
>
>
> From: Gidon Gershinsky/Haifa/IBM@IBMIL
> To: [email protected]
> Date: 07/07/2015 06:13 PM
> Subject: Re: Task revocation
>
>
>
> Having a kill action will work fine for our module. By the way, our policy
> mechanism doesn't kill individual tasks by their IDs, but rather kills
> some task(s) of frameworks (or roles) that currently use more resources
> than they deserve - to free up a defined amount of cpu/mem/etc. Hence, the
> allocator continues to work with the framework/role entities (offer
> resources, or revoke resources); we won't need to handle individual tasks
> there.
>
>
> Regards,
> Gidon
>
>
>
>
>
>
>
> From: Alex Rukletsov <[email protected]>
> To: dev <[email protected]>
> Date: 07/07/2015 05:45 PM
> Subject: Re: Task revocation
>
>
>
> I think we should move away from the callback architecture in Allocator
> and avoid introducing new callbacks. In order to implement things like
> optimistic offers [1], we may need to generate and bookkeep resource
> offers in an allocator. In this case, a cleaner and more general solution
> is to keep a queue of actions in the allocator and let the master fetch
> from this queue. Actions might be: resource offer, reservation order, kill
> task and so on. This will also allow us to introduce and delegate new
> actions to allocators without changing the Allocator interface, without
> breaking old allocators that do not support new actions.
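
The queue-of-actions idea can be sketched roughly as follows. This is only
an illustration under stated assumptions: ActionType, Action, ActionQueue
and processActions are hypothetical names, not part of the Mesos Allocator
interface, and the string payload stands in for real protobuf messages. The
allocator pushes actions and the master drains and dispatches them, so a new
action type does not change the interface between the two.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical action kinds an allocator could emit.
enum class ActionType { OFFER_RESOURCES, RESERVE, KILL_TASK };

// Hypothetical action record; `payload` is a task id for KILL_TASK and a
// resource description otherwise.
struct Action {
  ActionType type;
  std::string frameworkId;
  std::string payload;
};

// Queue owned by the allocator; the master fetches from it.
class ActionQueue {
 public:
  void push(Action action) { actions_.push_back(std::move(action)); }

  // Returns all pending actions in FIFO order and empties the queue.
  std::vector<Action> drain() {
    std::vector<Action> out(actions_.begin(), actions_.end());
    actions_.clear();
    return out;
  }

  bool empty() const { return actions_.empty(); }

 private:
  std::deque<Action> actions_;
};

// Master side: fetch pending actions and dispatch on their type. Here the
// "dispatch" only records what would be done.
std::vector<std::string> processActions(ActionQueue& queue) {
  std::vector<std::string> log;
  for (const Action& action : queue.drain()) {
    switch (action.type) {
      case ActionType::OFFER_RESOURCES:
        log.push_back("offer " + action.payload + " to " + action.frameworkId);
        break;
      case ActionType::RESERVE:
        log.push_back("reserve " + action.payload + " for " +
                      action.frameworkId);
        break;
      case ActionType::KILL_TASK:
        log.push_back("kill " + action.payload + " of " + action.frameworkId);
        break;
    }
  }
  return log;
}
```

An allocator that never emits KILL_TASK keeps working unchanged, which is
the compatibility property described above.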
>
> With the allocator being modularized now, and given there are some ongoing
> efforts around alternative allocators, maybe we should start a discussion
> about cleaning up the Allocator interface?
>
> [1] https://issues.apache.org/jira/browse/MESOS-1607
>
> On Tue, Jul 7, 2015 at 3:45 PM, Gidon Gershinsky <[email protected]> wrote:
>
> > Adam,
> >
> > Having an HTTP API to kill a task would help.
> > I'd also suggest considering a 'native' killTask C++ API that can be
> > called from within a resource allocation module (
> > https://issues.apache.org/jira/browse/MESOS-2160).
> > This is a more direct route, since the module runs inside the master
> > process.
> > Currently, an allocation module is given one callback (to send resource
> > offers) upon initialization. Technically, it should be possible to add
> > another callback, for killing tasks.
> > Authorization-wise, it's not a problem, given the power already provided
> > to such plug-in modules, which can easily starve frameworks etc.
> >
> >
> > From: Adam Bordelon <[email protected]>
> > Date: Mon, Jul 6, 2015 at 11:48 AM
> > Subject: Re: Task revocation
> > To: dev <[email protected]>
> >
> >
> > Gidon,
> >
> > If your allocation module is capable of sending protobuf messages to the
> > master, you could send a KillTaskMessage with the proper frameworkId and
> > taskId and hack a way around the if condition at
> > https://github.com/apache/mesos/blob/0.23.0-rc1/src/master/master.cpp#L2946
> >
> > I think in general, it would be great to add an authenticated/authorized
> > killTask endpoint on the master, for operators, tools, or services like
> > your allocator to use to kill tasks. This may be incorporated in the
> > upcoming HTTP API redesign.
> >
> > On Thu, Jul 2, 2015 at 8:42 AM, Gidon Gershinsky <[email protected]> wrote:
> >
> > > Thanks.
> > >
> > > Yep, that's what I meant by an out-of-band option. I wonder if there is
> > > a direct way to kill a task from inside a resource allocation module, or
> > > to send a message from the resource allocation module to a framework.
> > >
> > > On Thu, Jul 2, 2015 at 6:09 PM, haosdent <[email protected]> wrote:
> > >
> > > > Hi, @Gidon. In fact, when you call framework.killTask, the framework
> > > > sends a KillTaskMessage containing the frameworkId and taskId to the
> > > > Master. The Master then forwards this message to the Slave, asking it
> > > > to kill the task.
> > > >
> > > > On Thu, Jul 2, 2015 at 10:18 PM, Gidon Gershinsky <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > We are developing a resource allocation module, using the new plug-in
> > > > > mechanism (taken directly from GitHub; thanks Alex, it works smoothly).
> > > > >
> > > > > Our module will need to revoke/kill tasks, e.g. to make space for more
> > > > > important ones (say, when they call driver.requestResources and no
> > > > > resources are available). I know task revocation can't currently be
> > > > > done from the master/module, but there is a driver API that enables
> > > > > frameworks to kill their tasks.
> > > > >
> > > > > I can implement an out-of-band mechanism, where the module will
> > > > > communicate with frameworks over a proprietary protocol and tell them
> > > > > to kill a task. But I wonder if there is another option for
> > > > > master-framework messaging (callable from the allocation module), maybe
> > > > > similar to executor-framework messaging? Or plans to add a kill task
> > > > > API in the master/module?
> > > > >
> > > > > Thanks, Gidon
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards,
> > > > Haosdent Huang
> > > >
> > >
> >
>
>
>