Got it, thanks Zameer!

Thanks,
Qian Zhang

On Tue, Oct 18, 2016 at 2:25 AM, Zameer Manji <zma...@apache.org> wrote:

> Qian,
>
> Turns out the --checkpoint flag was made default and removed in Mesos 0.22.
>
> On Sun, Oct 16, 2016 at 4:38 PM, Qian Zhang <zhq527...@gmail.com> wrote:
>
>> and requires operators to enable checkpointing on the slaves.
>>
>>
>> Just curious why operator needs to enable checkpointing on the slaves (I
>> do not see an agent flag for that), I think checkpointing should be enabled
>> in framework level rather than slave.
>>
>>
>> Thanks,
>> Qian Zhang
>>
>> On Sun, Oct 16, 2016 at 10:18 AM, Zameer Manji <zma...@apache.org> wrote:
>>
>>> +1 to A and B
>>>
>>> Aurora has enabled checkpointing for years and requires operators to
>>> enable
>>> checkpointing on the slaves.
>>>
>>> On Sat, Oct 15, 2016 at 11:57 AM, Joris Van Remoortere <
>>> jo...@mesosphere.io>
>>> wrote:
>>>
>>> > I'm in favor of A & B. I find it provides a better "first experience"
>>> to
>>> > users.
>>> > From my experience you usually have to have an explicit reason to not
>>> want
>>> > to checkpoint. Most people assume the semantics provided by the
>>> checkpoint
>>> > behavior is default and it can be a frustrating experience for them to
>>> find
>>> > out that is not the case.
>>> >
>>> > —
>>> > *Joris Van Remoortere*
>>>
>>> > Mesosphere
>>> >
>>> > On Fri, Oct 14, 2016 at 3:11 PM, Neil Conway <neil.con...@gmail.com>
>>> > wrote:
>>> >
>>> >> Hi folks,
>>> >>
>>> >> I'd like input from individuals who currently use frameworks but do
>>> >> not enable checkpointing.
>>> >>
>>> >> Background: "checkpointing" is a parameter that can be enabled in
>>> >> FrameworkInfo; if enabled, the agent will write the framework pid,
>>> >> executor PIDs, and status updates to disk for any tasks started by
>>> >> that framework. This checkpointed information means that these tasks
>>> >> can survive an agent crash: if the agent exits (whether due to
>>> >> crashing or as part of an upgrade procedure), a restarted agent can
>>> >> use this information to reconnect to executors started by the previous
>>> >> instance of the agent. The downside is that checkpointing requires
>>> >> some additional disk I/O at the agent.
>>> >>
>>> >> Checkpointing is not currently the default, but in my experience it is
>>> >> often enabled for production frameworks. As part of the work on
>>> >> supporting partition-aware Mesos frameworks (see MESOS-4049), we are
>>> >> considering:
>>> >>
>>> >> (a) requiring that partition-aware frameworks must also enable
>>> >> checkpointing, and/or
>>> >> (b) enabling checkpointing by default
>>> >>
>>> >> If you have intentionally decided to disable checkpointing for your
>>> >> Mesos framework, I'd be curious to hear more about your use-case and
>>> >> why you haven't enabled it.
>>> >>
>>> >> Thanks!
>>> >>
>>> >> Neil
>>> >>
>>> >> --
>>> >> Zameer Manji
>>> >>
>>> >
>>>
>>> --
>>> Zameer Manji
>>>
>>

Reply via email to