On Sat, Nov 20, 2021 at 9:57 PM Laurent Bercot <ska-skaw...@skarnet.org> wrote:
>
>
> >I have a problem process that's getting stuck somewhere in its network
> >code (clearly the right answer is to find/fix the problem, but it's
> >infrequent and we're up against time constraints...)
> >
> >I'm looking at just adding a wrapper around it which does something
> >along the lines of the existing s6 notification - have the supervised
> >process write a regular character to an inherited fd, if it ever stops
> >then the wrapper kills everyone and exits allowing the supervision to
> >restart it.
>
> What you want is process monitoring. s6 doesn't do that, because
> the monitoring needs are very process-specific and it's impossible to
> predict all the functionality that every daemon under the sun could
> want to use.
>

Indeed, wasn't sure if I was missing something!

> Here, you need a heartbeat. You can implement the heartbeat monitor
> as a separate service, which it is: it's a service that reads the
> heartbeat from your daemon and sends it an s6-svc -r command when
> it fails to receive the heartbeat for a period of time.
> If you don't need anything more than that, you can probably write
> the heartbeat monitor in shell (or even in execline!) It's a call
> to mkfifo then a read loop. :)
>

Thanks for the steer on the pattern.

--
Alex Kiernan
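
For reference, a minimal sketch of the "mkfifo then a read loop" pattern described above. It assumes bash (for `read -t`) and Linux (where a FIFO can be opened read-write). The function name `heartbeat_monitor`, the FIFO path, the timeout and the service directory in the usage note are all hypothetical, not anything from s6 itself; only `s6-svc -r` comes from the thread.

```shell
#!/bin/bash
# Hypothetical heartbeat monitor sketch; names, paths and timeouts are
# illustrative assumptions, not part of s6.

# heartbeat_monitor FIFO TIMEOUT CMD [ARGS...]
# Treats each line read from FIFO as one heartbeat. The first time no
# complete line arrives within TIMEOUT seconds, runs CMD and returns
# its exit status.
heartbeat_monitor() {
  local fifo=$1 timeout=$2 beat
  shift 2

  # Create the FIFO if it does not exist yet.
  [ -p "$fifo" ] || mkfifo "$fifo" || return 1

  # Open read-write (Linux-specific behaviour): the open does not block
  # waiting for a writer, and read never sees EOF when the daemon exits.
  exec 3<>"$fifo" || return 1

  # Keep looping as long as a heartbeat line arrives in time.
  while read -r -t "$timeout" -u 3 beat; do
    :   # heartbeat received, nothing else to do
  done

  exec 3<&-

  # Timed out: run the caller-supplied action.
  "$@"
}
```

Run as its own supervised service, this could be driven by something like `while :; do heartbeat_monitor /run/mydaemon.hb 30 s6-svc -r /run/service/mydaemon; done`, with the daemon writing a newline to the FIFO at an interval comfortably shorter than the timeout. Passing the action as arguments keeps the restart command out of the loop itself, so the same function can be exercised with a harmless command first.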