(1) When the ShutdownProcess is gone, we'll adjust the timeout calculation
logic.

(2) In the proposal, we tie the grace period to the executor, but use it
for task finalization. Effectively we assume that each executor launches
similar tasks. This may seem a bit weird, but having timeouts per task
makes the change more intrusive. I would propose to revisit this concept
when we start discussing a more general "finalization" approach.

(3) I think the concept of custom finalization may be useful. Either way,
we need a timeout for the cases when finalization gets stuck or takes too
long. It would be nice to have a single flag for this timeout. Right now
we already have two: one for executor shutdown and one for the docker stop
timeout.
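
To illustrate what a single timeout flag would govern, here is a minimal
sketch of signal escalation: ask the task to finalize with SIGTERM, wait for
the grace period, and fall back to SIGKILL if it is stuck. This is not the
actual Mesos code; the function name and the grace_period_secs parameter are
hypothetical, and it assumes a POSIX system.

```python
import subprocess


def stop_with_grace_period(proc, grace_period_secs):
    """Stop a subprocess, escalating from SIGTERM to SIGKILL.

    `proc` is a subprocess.Popen instance. `grace_period_secs` is a
    hypothetical stand-in for the single timeout flag discussed above.
    Returns "graceful" if the process exited within the grace period,
    "escalated" if it had to be killed.
    """
    proc.terminate()  # sends SIGTERM on POSIX: the request to finalize
    try:
        proc.wait(timeout=grace_period_secs)
        return "graceful"
    except subprocess.TimeoutExpired:
        # Finalization is stuck or too slow: escalate.
        proc.kill()  # sends SIGKILL on POSIX, which cannot be ignored
        proc.wait()  # reap the process so it does not linger as a zombie
        return "escalated"
```

With one flag, the same value could be passed here regardless of whether the
caller is shutting down an executor or stopping a docker container.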

I'm glad that we all agree that the current grace shutdown configuration
needs some love. I'll follow up with the patches soon.

On Thu, Nov 13, 2014 at 9:59 PM, Benjamin Mahler <[email protected]>
wrote:

> Short term, cleaning up the current static configuration mess sounds good.
>
> Some food for thought for the longer term:
>
> (1) Keep in mind that the ShutdownProcess inside ExecutorProcess will be
> going away when we have pure language bindings.
>
> (2) Killing a task and shutting down an executor are independent concepts.
> Unfortunately, executor shutdown is not exposed in the framework API yet.
> So, when it comes to custom executors, the grace period (or whatever other
> finalization [1]) is *completely* in the hands of the framework.
>
> (3) We provide CommandExecutor as a convenience, and as we've discovered
> broadly useful concepts for frameworks, like health checking, we've added
> them in. It sounds like doing "finalization" might be another broadly
> useful concept, wherein much like they can control the definition of a
> health check, they will want to control the definition of "finalization".
>
> Thoughts?
>
> [1] I believe Thermos exposes some finalization, might be useful to
> reference:
>
> http://aurora.incubator.apache.org/documentation/latest/configuration-reference/#final
>
> On Wed, Nov 12, 2014 at 3:06 PM, Niklas Nielsen <[email protected]>
> wrote:
>
> > I thought of signal escalation as per-executor, or actually everywhere
> > we execute a command info as a subprocess.
> > The new grace period is meant as the time an executor has to finish off
> > its things - changing the other timeouts had to be done as they will in
> > most cases be shorter.
> > For custom executors, it is up to them to honor the timeout; or else,
> > the executor process will kill it after timeout + delta time.
> >
> > Ben, are you thinking of a more generalized finalization mechanism
> > (pluggable, programmable)?
> >
> > Niklas
> >
> > On 11 November 2014 10:34, Alex Rukletsov <[email protected]> wrote:
> >
> > > Ben,
> > >
> > > there are two scenarios: executor shutdown and killTask() in
> > > CommandExecutor. For the first use case, each custom executor is
> > > affected through the ExecutorProcess, which means two levels are
> > > involved (containerizer and executor) and should be synchronized.
> > >
> > > In the second scenario, each task is tied to its own CommandExecutor,
> > > therefore killing a task implies killing its executor. In this case,
> > > the grace shutdown period also becomes a signal escalation timeout, and
> > > conflating them, I think, is a good idea. The proposed design doc is an
> > > effort to align timeouts along the chain from slave to CommandExecutor.
> > >
> > > If I understand you correctly, we want to shut down any executor (task)
> > > gracefully, and not tie the grace period to CommandExecutor only. A good
> > > example pointed out by Ankur Chauhan is MESOS-1925
> > > <https://issues.apache.org/jira/browse/MESOS-1925>: we can reuse the
> > > same grace shutdown flag for dockers. And if we later enable frameworks
> > > to adjust timeouts for their tasks (or executors, to be precise), we
> > > will be able to align the timeout used by docker finalization with the
> > > timeout in the docker container.
> > >
> > > On Mon, Nov 10, 2014 at 10:00 PM, Benjamin Mahler <[email protected]> wrote:
> > >
> > > > I'm guessing most of the motivation here is actually for task killing
> > > > escalation in the command executor? The shutdown grace period was
> > > > designed for executor shutdown only, which today occurs only when the
> > > > framework is being shut down (or recovery is cleaning up), or in the
> > > > future, when frameworks ask to shut down a specific executor.
> > > >
> > > > In the case of the command executor, the slave won't do any
> > > > escalation when a killTask arrives, since it's not trying to shut down
> > > > the executor. For simplicity (I'm guessing), we conflated the executor
> > > > shutdown grace period with the killTask signal escalation in the
> > > > command executor.
> > > >
> > > > So, I'm still trying to figure out the concrete use case here: is it
> > > > that you have command-tasks that implement a clean shutdown driven by
> > > > SIGTERM? Going forward, is that enough, or would we want a more
> > > > general notion of "Finalization" (e.g. driven by HTTP, or SIGTERM, or
> > > > subprocess, etc), much like the generic health checking that was
> > > > added?
> > > >
> > > > On Mon, Nov 10, 2014 at 8:08 AM, Alex Rukletsov <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I would like to share the design doc for configurable grace period
> > > > > <https://docs.google.com/document/d/1_b3OPv3tjkub1T6VhQ27GnDfbVjnJ6IQ4ufPQhV1HM8/edit?usp=sharing>.
> > > > > The doc describes two approaches to calculate nested grace periods,
> > > > > points out implementation details and opens several design
> > > > > questions.
> > > > >
> > > > > I would highly appreciate any thoughts, ideas and suggestions!
> > > > >
> > > > > Thanks,
> > > > > Alex
> > > > >
> > > >
> > >
> >
>
