Re: Application supervisor: sync point manager

Laurent Bercot Fri, 19 Aug 2016 14:46:07 -0700

On 19/08/2016 20:04, Lionel Van Bemten wrote:

What are possible reasons for s6-supervise to die?


 Normally, it won't die. But the main point of supervision is to prepare
for abnormal situations and have your services running even if the world
is crumbling around them. s6-supervise may get a stray signal sent by the
OOM killer, for instance. Or a misconfigured program may send it a
s6-svc -x or a SIGTERM. Stranger things have happened. s6-svscan will
restart s6-supervise, but if the service directory contains a down file,
the new s6-supervise won't start the service; that breaks the guarantee
that your services are kept up by the supervision tree.
 Of course, it's very possible that the first random death of an
s6-supervise process won't happen for a few millennia. I very much hope
that's the case.

After reboot of my system I want the down files to be there.


 Of course. That's one of the reasons why I usually recommend keeping
your servicedir repository in an immutable place (typically a read-only
root filesystem) and *copying* the service directories you need into a
tmpfs during early boot, then working with the copies.

 s6-rc-init does this: it copies the service directories from the compiled
database into the live directory, and adds down files to the copies before
linking them into the scandir.
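
 To illustrate the copy-then-link idea, here is a minimal Python sketch.
It is not s6-rc-init's actual code; the function name and the three
directory arguments are made up for the example, and it leaves out the
step of telling s6-svscan to rescan.

```python
import shutil
from pathlib import Path


def stage_servicedirs(repo: Path, live: Path, scandir: Path) -> list[str]:
    """Copy every service directory from repo into live, mark each copy
    as down, then link it into scandir. Returns the staged names.

    repo, live and scandir are hypothetical paths chosen by the caller;
    live and scandir are expected to sit on a fresh tmpfs.
    """
    live.mkdir(parents=True, exist_ok=True)
    scandir.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in sorted(p for p in repo.iterdir() if p.is_dir()):
        copy = live / src.name
        shutil.copytree(src, copy, symlinks=True)
        # The down file goes into the copy only; the repository is untouched.
        (copy / "down").touch()
        # Link last, so the scandir never exposes a copy without its down file.
        (scandir / src.name).symlink_to(copy.resolve())
        staged.append(src.name)
    return staged
```

 Since the down files only ever exist in the copies, they are recreated
on every boot and the repository itself never needs to be written to.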

Also I would be curious to know your opinion about solution 3, i.e.
having a supervised process (spm) fork into s6-svscan. Do you see that
as a valid approach?


 That's valid, in the sense that it works; but it needlessly adds complexity
to the setup. Generally speaking, unless you are working with different
sets of privileges (and, for instance, want a supervision tree to run under
a specific uid), you never need to have several supervision trees, or to
nest them; in my experience, you can always work with a single
s6-svscan instance and a flat tree.

 Having spm fork its own supervision tree is more complex because:
 - you need spm to perform supervision duties on its s6-svscan child, lest
you break the supervision tree guarantee: so you need to duplicate the
supervision functionality into spm.
 - you need two separate scandirs. When looking for a service, you have
no automatic way (other than your policy) of knowing which tree the service
is supervised by. Is my servicedir /rootscandir/foobar or
/rootscandir/spm/spmscandir/foobar?
 - any automation of your service management (à la s6-rc) needs to take
your setup into account.
 - you use marginally more resources (one s6-svscan process plus the
supervision code in spm).

 Since you already have an operational supervision tree when you run spm,
you can have spm reuse the same one for its own operations: just give it
the scandir as an argument or a config variable, and it will be able to
operate on it. It doesn't matter if it's the same scandir spm itself is
supervised in.
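
 As a rough sketch of what "give it the scandir as an argument or a
config variable" could look like, assuming spm were written in Python:
the SPM_SCANDIR variable name and both function names are hypothetical,
and the code only builds an s6-svc command line without running it.

```python
import os
from pathlib import Path
from typing import Optional

# Hypothetical environment variable name; the thread does not define one.
SCANDIR_ENV = "SPM_SCANDIR"


def resolve_scandir(arg: Optional[str] = None) -> Path:
    """Return the scandir to operate on: explicit argument first,
    then the environment variable."""
    value = arg or os.environ.get(SCANDIR_ENV)
    if not value:
        raise ValueError(f"no scandir given (argument or ${SCANDIR_ENV})")
    return Path(value)


def svc_command(scandir: Path, service: str, up: bool = True) -> list[str]:
    """Build the s6-svc command line that brings a service up (-u)
    or down (-d) in the given scandir."""
    return ["s6-svc", "-u" if up else "-d", str(scandir / service)]
```

 The resulting list can be handed to subprocess.run() as is. There is
only one scandir to look in, so /rootscandir/foobar is the answer for
every service, spm's own included.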

--
 Laurent
