Re: Non-checkpointing frameworks

Joris Van Remoortere Sat, 15 Oct 2016 11:57:43 -0700

I'm in favor of A & B. I find it provides a better "first experience" to
users.
From my experience you usually have to have an explicit reason to not want
to checkpoint. Most people assume the semantics provided by checkpointing
are the default, and it can be frustrating for them to find out that this
is not the case.


—
*Joris Van Remoortere*
Mesosphere

On Fri, Oct 14, 2016 at 3:11 PM, Neil Conway <[email protected]> wrote:

> Hi folks,
>
> I'd like input from individuals who currently use frameworks but do
> not enable checkpointing.
>
> Background: "checkpointing" is a parameter that can be enabled in
> FrameworkInfo; if enabled, the agent will write the framework PID,
> executor PIDs, and status updates to disk for any tasks started by
> that framework. This checkpointed information means that these tasks
> can survive an agent crash: if the agent exits (whether due to
> crashing or as part of an upgrade procedure), a restarted agent can
> use this information to reconnect to executors started by the previous
> instance of the agent. The downside is that checkpointing requires
> some additional disk I/O at the agent.
>
> Checkpointing is not currently the default, but in my experience it is
> often enabled for production frameworks. As part of the work on
> supporting partition-aware Mesos frameworks (see MESOS-4049), we are
> considering:
>
> (a) requiring that partition-aware frameworks must also enable
> checkpointing, and/or
> (b) enabling checkpointing by default
>
> If you have intentionally decided to disable checkpointing for your
> Mesos framework, I'd be curious to hear more about your use-case and
> why you haven't enabled it.
>
> Thanks!
>
> Neil
>
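
To make the background in the quoted message concrete, here is a minimal toy model of the behavior it describes. It does not use the Mesos API: `ToyAgent`, its methods, the JSON file layout, and the PIDs are all hypothetical, and `FrameworkInfo` here is a simplified stand-in that keeps only a name and a `checkpoint` flag. The sketch only illustrates the trade-off: a framework that opts in costs the agent extra disk writes, and in exchange a restarted agent can find that framework's executors again.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FrameworkInfo:
    # Simplified stand-in, not the real Mesos message.
    name: str
    checkpoint: bool = False  # off unless the framework opts in


class ToyAgent:
    """Hypothetical agent that tracks executor PIDs per framework."""

    def __init__(self, work_dir: Path):
        self.work_dir = work_dir
        self.executors = {}  # framework name -> list of executor PIDs

    def launch(self, framework: FrameworkInfo, executor_pid: int) -> None:
        self.executors.setdefault(framework.name, []).append(executor_pid)
        if framework.checkpoint:
            # The additional disk I/O: persist state on every change.
            path = self.work_dir / f"{framework.name}.json"
            path.write_text(json.dumps(self.executors[framework.name]))

    @classmethod
    def recover(cls, work_dir: Path) -> "ToyAgent":
        """Simulate a restarted agent: only checkpointed state is visible."""
        agent = cls(work_dir)
        for path in work_dir.glob("*.json"):
            agent.executors[path.stem] = json.loads(path.read_text())
        return agent


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        work_dir = Path(tmp)
        agent = ToyAgent(work_dir)
        agent.launch(FrameworkInfo("with-checkpoint", checkpoint=True), 101)
        agent.launch(FrameworkInfo("no-checkpoint"), 202)
        del agent  # the agent process exits (crash or upgrade)
        return ToyAgent.recover(work_dir).executors
```

Calling `demo()` launches one executor for each of two frameworks, discards the agent, and recovers from disk. Only the framework that set `checkpoint=True` is known to the recovered agent; the other framework's executor is lost to it, which is the surprise Joris describes for users who assume checkpointing is the default.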
