Re: Machine agents uninstall themselves upon worker.ErrTerminateAgent.

William Reade Mon, 09 May 2016 02:59:40 -0700

On Mon, May 9, 2016 at 11:03 AM, Andrew Wilkins <
andrew.wilk...@canonical.com> wrote:


>> Meta: a name like "uninstall-agent" is really misleading if it actually
>> means "machine-X is definitely dead". I never had the slightest indication
>> that was what it was meant to mean. But in that case, why *don't* we write
>> the file on certain API connection failures? There is logic that
>> *explicitly* checks if the machine is Dead -- surely that's a trigger too?
>>
>
> We do, but it looks like we've regressed since the MADE branch landed.
> api.ErrConnectImpossible now causes uninstall-agent, which means a
> (possibly temporary) authorization error will once again cause an uninstall:
>
> https://github.com/juju/juju/blob/master/worker/apicaller/connect.go#L176
>

Yeah, we discussed that at the time and evidently something went wrong -- I
was still misconceiving uninstall-agent as a manual-specific,
provision-time-only switch, which had been incorrectly overloaded with
is-it-dead-no-matter-the-provider meaning.


>>>  (1) record (in agent config?) that a machine is manual
>>>  (2) only ever do anything uninstall-related for manual machines
>>>  (3) only ever do uninstall-related things if the machine actually is
>>> Dead
>>>  (4) drop lxc-specific logic from uninstall *when LXC support is removed*
>>>
>>
>> Generally SGTM, but confused re (4) -- doesn't that have to be
>> fixed/moved/removed first, so that we can switch off the suicide
>> behaviour in other cases without regressing?
>>
>
> We can remove the LXC/loop bits now if we're fine with leaking loop
> devices, which is probably OK assuming we are actually removing LXC before
> 2.0. Loop devices have to be explicitly enabled anyway.
>

Sounds like a plan. Agent config seems like a decent place to store
manual-ness; and having a specific ErrAgentEntityDead seems to me to be the
best signal for triggering a check and potential uninstall.

(in fact: for great safety, it can and probably should be
cmd/jujud/agent.errEntityDead, which gets set in a manifold- or
engine-config Filter field -- so it only ever gets injected in response to
specific errors from specific workers. It's unquestionably the agent's
decision to make; and it's the workers' responsibility to return precise
errors, rooted in their own contexts, that don't prejudge how the client
might want to respond.)

Cheers
William

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev
