Hi!

> > Yes, in my usecase this would be used at the place where sd_notify()
> > is used if the service runs under systemd. Then periodically executed
> > watchdog could check the service makes progress and react if it
> > doesn't.
>
> If a single notification step is enough for you, i.e. the service
> goes from a "preparing" state to a "ready" state and remains ready
> until the process dies, then what you want is implemented in the s6
> process supervisor: https://skarnet.org/software/s6/notifywhenup.html
>
> Then you can synchronously wait for service readiness
> (s6-svwait $service) or, if you have a watchdog service, periodically
> poll for readiness (s6-svstat -r $service).
>
> But that's only valid if your service can only change states once
> (from "not ready" to "ready"). If you need anything more complex, s6
> won't support it intrinsically.

No, I need to monitor that the service is alive: my watchdog script
would check whether the status message is older than a defined
threshold, in which case it would kill the service (the rest would be
handled in the finish script).
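
To illustrate the kind of check I mean, here is a rough sketch only. It assumes the service refreshes a heartbeat file each time it makes progress; the file path, service directory and threshold are hypothetical names picked for the example, and treating a missing file as stale is also just an assumption. The only external command used is runit's `sv kill`.

```python
import os
import subprocess
import time

# Hypothetical names, chosen only for this illustration.
HEARTBEAT_FILE = "/run/mydaemon/heartbeat"
SERVICE_DIR = "/etc/service/mydaemon"
MAX_AGE_SECONDS = 60.0


def heartbeat_age(path, now=None):
    """Return seconds since `path` was last modified, or None if it is missing."""
    if now is None:
        now = time.time()
    try:
        return now - os.stat(path).st_mtime
    except FileNotFoundError:
        return None


def is_stale(path, max_age, now=None):
    """True if the heartbeat is missing (an assumption) or older than `max_age`."""
    age = heartbeat_age(path, now)
    return age is None or age > max_age


def check_once(path=HEARTBEAT_FILE, service=SERVICE_DIR, max_age=MAX_AGE_SECONDS):
    """Kill the service with `sv kill` if its heartbeat is stale."""
    if is_stale(path, max_age):
        # After the kill, runsv runs ./finish and restarts ./run as usual.
        subprocess.run(["sv", "kill", service], check=False)
        return True
    return False
```

Calling `check_once()` periodically (from cron or from a small service of its own) would give the behaviour described above. Note that the sketch runs outside runsv, so nothing in it serializes the check against a restart of the service.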

> The reason why there isn't more advanced support for this in any
> supervision suite (save systemd but even there it's pretty minimal)
> is that service states other than "not ready yet" and "ready" are
> very much service-dependent and it's impossible for a generic process
> supervisor to support enough states for every possible existing service.
> Daemons that need complex states usually come with their own
> monitoring software that handles their specific states, with integrated
> health checks etc.
>
> So my advice would be:
> - if what you need is just readiness notification, switch to s6.
> It's very similar to runit and I think you'll find it has other
> benefits as well. The drawback, obviously, is that it's not in busybox
> and the required effort to switch may not be worth it.
> - if you need anything more complex, you can stick to runit, but you
> will kinda need to write your own monitor for your daemon, because
> that's what everyone does.
>
> Depending on the details of the monitoring you need, the monitoring
> software can be implemented as another service (e.g. to receive
> heartbeats from your daemon), or as a polling client (e.g. to do
> periodic health checks). Both approaches are valid.

That's what I thought of as well, but keeping this completely outside
runsv opens a race window in which the watchdog can kill a service
that has just restarted itself. This could be avoided if the check
were serialized with the other steps (run/finish execution) within
runsv. So far the futile restart of the service doesn't seem to cause
me any problems, so I'm not much bothered by it.

> Don't hack on runit, especially the control pipe thing. It will not
> end well.
> (runit's control pipe feature is super dangerous, because it allows a
> service to hijack the control flow of its supervisor, which endangers
> the supervisor's safety. That's why s6 does not implement it; it
> provides similar - albeit slightly less powerful - control features
> via ways that never give the service any power over the supervisor.)

The main reason I wanted to use the service pipe for it was the
possibility of seeing the service status in the process tree, which
would be a nice benefit.

BR,
Petr