I think we misunderstand each other a bit. By "executor" I usually mean
any executor other than the base one, ExecutorProcess.

If by task finalization we mean sending SIGTERM / SIGKILL to a child
process, then you are right, it's only relevant for the CommandExecutor.
But custom (you call them regular) executors may reuse the grace shutdown
field (which may be set by the framework scheduler) to do their own
finalization. As you mention, since the containerizer uses the grace
shutdown timeout to ensure the executor cleans up, it will affect all
executors during executor shutdown.
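
To make the SIGTERM / SIGKILL escalation concrete, here is a minimal
sketch in plain Python. It is not Mesos code: the function name
stop_with_escalation and the grace_secs parameter are made up for
illustration, and the only assumption is that the child is an ordinary
subprocess that exits on SIGTERM unless it handles the signal itself.

```python
import subprocess
import sys


def stop_with_escalation(proc, grace_secs):
    """Ask a child process to stop, then force-kill it after a grace period.

    Hypothetical helper for illustration only; it is not part of any
    Mesos API. Returns "terminated" if the child exited within the grace
    period and "killed" if the escalation step was needed.
    """
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_secs)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"


if __name__ == "__main__":
    # A child that just sleeps and does not handle SIGTERM specially.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"])
    print(stop_with_escalation(child, grace_secs=5.0))
```

A custom executor could follow the same pattern for its own finalization,
using the grace shutdown value as grace_secs.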

What I wanted to outline in (2) is that there is one timeout per executor.
Custom (regular) executors may use it for their own task finalization, but
the outer containerizer's timeout is set once per executor, which means an
executor may be killed while it is still waiting for some of its tasks to
finalize.
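
A toy illustration of that last point, again in plain Python rather than
Mesos code. The function name, the task names and the durations are all
invented; the sketch only assumes that every task needs a known amount of
time to finalize and that the executor gets a single grace period.

```python
def unfinished_tasks(executor_grace_secs, finalization_secs_by_task):
    """Return the tasks still finalizing when the executor timeout fires.

    Hypothetical model: one grace period for the whole executor, and a
    per-task finalization time that the outer timeout knows nothing about.
    """
    return sorted(
        task
        for task, needed in finalization_secs_by_task.items()
        if needed > executor_grace_secs
    )


if __name__ == "__main__":
    # Invented tasks with different finalization needs (seconds).
    tasks = {"cache-flush": 2.0, "db-checkpoint": 12.0, "log-upload": 5.0}
    # With a 5 second executor-level grace period, the slowest task is
    # cut off even though the executor is still waiting for it.
    print(unfinished_tasks(5.0, tasks))
```

With per-task timeouts the slow task could get its own limit, which is the
more intrusive change mentioned in (2).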

On Fri, Nov 14, 2014 at 8:48 PM, Benjamin Mahler <[email protected]>
wrote:

> For (2), sorry, I get a bit confused when "executor" and "CommandExecutor"
> are used interchangeably.
>
> So let me confirm we're on the same page: Task finalization is only
> relevant for the CommandExecutor, that's where the CommandExecutor assumes
> that all tasks need a similar grace period. For regular executors, the
> finalization of a task is entirely in the executor's hands, the slave does
> not use a grace period here to impose anything. For all executors
> (CommandExecutor included), a grace period will be used when the executor
> is being shutdown. Does this match your understanding?
>
> On Fri, Nov 14, 2014 at 5:37 AM, Alex Rukletsov <[email protected]>
> wrote:
>
> > (1) When the ShutdownProcess is gone, we'll adjust the timeout
> calculation
> > logic.
> >
> > (2) In the proposal, we tie the grace period to the executor, but use it
> > for task finalization. Effectively we assume that each executor launches
> > similar tasks. This may seem a bit weird, but having timeouts per task
> > makes the change more intrusive. I would propose to revisit this concept
> > when we start discussing a more general "finalization" approach.
> >
> > (3) I think the concept of custom finalization may be useful. Anyway, we
> > need to have a timeout for the cases when finalization is stuck or takes
> > too much time. It would be nice to have a single flag for such timeout.
> Now
> > we already have two: one for executor shutdown and one for the docker
> stop
> > timeout.
> >
> > I'm glad that we all agree current grace shutdown configuration needs
> some
> > love. I'll follow up with the patches soon.
> >
> > On Thu, Nov 13, 2014 at 9:59 PM, Benjamin Mahler <
> > [email protected]>
> > wrote:
> >
> > > Short term, cleaning up the current static configuration mess sounds
> > good.
> > >
> > > Some food for thought for the longer term:
> > >
> > > (1) Keep in mind that the ShutdownProcess inside ExecutorProcess will
> be
> > > going away when we have pure language bindings.
> > >
> > > (2) Killing a task and shutting down an executor are independent
> > concepts.
> > > Unfortunately, executor shutdown is not exposed in the framework API
> yet.
> > > So, when it comes to custom executors, the grace period (or whatever
> > other
> > > finalization [1]) is *completely* in the hands of the framework.
> > >
> > > (3) We provide CommandExecutor as a convenience, and as we've
> discovered
> > > broadly useful concepts for frameworks, like health checking, we've
> added
> > > them in. It sounds like doing "finalization" might be another broadly
> > > useful concept, wherein much like they can control the definition of a
> > > health check, they will want to control the definition of
> "finalization".
> > >
> > > Thoughts?
> > >
> > > [1] I believe Thermos exposes some finalization, might be useful to
> > > reference:
> > >
> > >
> >
> http://aurora.incubator.apache.org/documentation/latest/configuration-reference/#final
> > >
> > > On Wed, Nov 12, 2014 at 3:06 PM, Niklas Nielsen <[email protected]>
> > > wrote:
> > >
> > > > I thought signal escalation as per-executor or actually everywhere
> > where
> > > we
> > > > execute a command info as a subprocess.
> > > > The new grace period is meant as the time an executor has to finish
> off
> > > > its things - changing the other timeouts had to be done as they will
> > in
> > > > most cases be shorter.
> > > > For custom executors, it is up to themselves to honor the timeout; or
> > > else,
> > > > the executor process will kill it after timeout + delta time.
> > > >
> > > > Ben, are you thinking of a more generalized finalization mechanism
> > > > (pluggable, programmable)?
> > > >
> > > > Niklas
> > > >
> > > > On 11 November 2014 10:34, Alex Rukletsov <[email protected]>
> wrote:
> > > >
> > > > > Ben,
> > > > >
> > > > > there are two scenarios: executor shutdown and killTask() in
> > > > > CommandExecutor. For the first use case, each custom executor is
> > > affected
> > > > > through the ExecutorProcess, that means two levels are involved
> > > > > (containerizer and executor) and should be synchronized.
> > > > >
> > > > > In the second scenario, each task is tied to its own
> CommandExecutor,
> > > > > therefore killing a task implies killing its executor. In this
> case,
> > > > grace
> > > > > shutdown period becomes also a signal escalation timeout and
> > conflating
> > > > > them together, I think, is a good idea. The proposed design doc is
> an
> > > > > effort to align timeouts along the chain from slave to
> > > CommandExecutor.
> > > > >
> > > > > If I understand you correctly, we want to shutdown any executor
> > (task)
> > > > > gracefully, and do not tie grace period to CommandExecutor only. A
> > good
> > > > > example pointed by Ankur Chauhan is MESOS-1925
> > > > > <https://issues.apache.org/jira/browse/MESOS-1925>: we can reuse
> > > > the
> > > > > same grace shutdown flag for dockers. And if we later enable
> > frameworks
> > > > to
> > > > > adjust timeouts for its tasks (or executors, to be precise), we
> will
> > be
> > > > > able to align the timeout used by docker finalization with the
> > timeout
> > > in
> > > > > docker container.
> > > > >
> > > > > On Mon, Nov 10, 2014 at 10:00 PM, Benjamin Mahler <
> > > > > [email protected]
> > > > > > wrote:
> > > > >
> > > > > > I'm guessing most of the motivation here is actually for task
> > killing
> > > > > > escalation in the command executor? The shutdown grace period was
> > > > > designed
> > > > > > for executor shutdown only, which today occurs only when the
> > > framework
> > > > is
> > > > > > being shutdown (or recovery is cleaning up), or in the future,
> when
> > > > > > frameworks ask to shutdown a specific executor.
> > > > > >
> > > > > > In the case of the command executor, the slave won't do any
> > > escalation
> > > > > when
> > > > > > a killTask arrives, since it's not trying to shutdown the
> executor.
> > > For
> > > > > > simplicity (I'm guessing), we conflated the executor shutdown
> grace
> > > > > period,
> > > > > > with the killTask signal escalation in the command executor.
> > > > > >
> > > > > > So, I'm still trying to figure out the concrete use case here, is
> > it
> > > > that
> > > > > > you have command-tasks that implement a clean shutdown driven by
> > > > SIGTERM?
> > > > > > Going forward, is that enough or would we want a more general
> > notion
> > > of
> > > > > > "Finalization" (e.g. driven by HTTP, or SIGTERM, or subprocess,
> > etc),
> > > > > much
> > > > > > like the generic health checking that was added.
> > > > > >
> > > > > > On Mon, Nov 10, 2014 at 8:08 AM, Alex Rukletsov <
> > [email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I would like to share the design doc for configurable grace
> > period
> > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1_b3OPv3tjkub1T6VhQ27GnDfbVjnJ6IQ4ufPQhV1HM8/edit?usp=sharing
> > > > > > > >.
> > > > > > > The doc describes two approaches to calculate nested grace
> > periods,
> > > > > > points
> > > > > > > out implementation details and opens several design questions.
> > > > > > >
> > > > > > > I would highly appreciate any thoughts, ideas and suggestions!
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Alex
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
