Re: MESOS-6233 Allow agents to re-register post a host reboot

tommy xiao Tue, 29 Nov 2016 14:25:07 -0800

I agree with James's points.

2016-11-30 0:48 GMT+08:00 James Peach <[email protected]>:


>
> > On Nov 28, 2016, at 6:09 PM, Yan Xu <[email protected]> wrote:
> >
> > So one thing that was brought up during offline conversations was that
> > if the host reboot is associated with hardware change (e.g., a new memory
> > stick):
> >
> >       • Currently: the agent would skip the recovery (and the chance of
> >         running into incompatible agent info) and register as a new agent.
> >       • With the change: the agent could run into incompatible agent
> >         info due to resource change and flap indefinitely until the
> >         operator intervenes.
> >
> > To mitigate this and maintain the current behavior, we can have the
> > agent run `rm -f <work_dir>/meta/slaves/latest` automatically upon
> > recovery failure, but only after the host has rebooted. This way the
> > agent can restart as a new agent without operator intervention.
> >
> > Any thoughts?
>
> I still think you need a mechanism for the master/agent to tell you
> whether it will honor the restart policy. Without this, you have to lock
> the framework to a Mesos version.
>
> An empty RestartPolicy is also problematic since it precludes using
> RestartPolicy in pods. If you later want to restart a task inside a pod but
> not across agent restarts you would have no way to express that.
>
> J
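
For reference, the mitigation quoted above (drop the `latest` symlink when
recovery fails, but only if the host has rebooted) could be sketched roughly
as below. This is an illustration only, not the Mesos implementation: the
function names are hypothetical, and comparing a checkpointed boot ID with
the current one is an assumed way of detecting a reboot.

```python
import os


def should_reset_agent_state(recovery_failed: bool,
                             checkpointed_boot_id: str,
                             current_boot_id: str) -> bool:
    """Reset only when recovery failed AND the host has rebooted.

    Assumption: a changed boot ID means the host rebooted since the
    agent last checkpointed its state.
    """
    return recovery_failed and checkpointed_boot_id != current_boot_id


def reset_agent_state(work_dir: str) -> bool:
    """Equivalent of `rm -f <work_dir>/meta/slaves/latest`.

    Removes the symlink itself, not the directory it points to.
    Returns True if something was removed, False if it was absent.
    """
    latest = os.path.join(work_dir, "meta", "slaves", "latest")
    try:
        os.remove(latest)
        return True
    except FileNotFoundError:
        # Like `rm -f`, a missing path is not an error.
        return False
```

With this shape, a recovery failure without a reboot leaves the checkpointed
state alone, so the operator still sees the failure as today.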




-- 
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com
