Re: Watchdog wrapper for s6
On Sat, Nov 20, 2021 at 9:57 PM Laurent Bercot wrote:
>
> >I have a problem process that's getting stuck somewhere in its network
> >code (clearly the right answer is to find/fix the problem, but it's
> >infrequent and we're up against time constraints...)
> >
> >I'm looking at just adding a wrapper around it which does something
> >along the lines of the existing s6 notification - have the supervised
> >process write a regular character to an inherited fd, if it ever stops
> >then the wrapper kills everyone and exits allowing the supervision to
> >restart it.
>
> What you want is process monitoring. s6 doesn't do that, because
> the monitoring needs are very process-specific and it's impossible to
> predict all the functionality that every daemon under the sun could
> want to use.
>

Indeed, wasn't sure if I was missing something!

> Here, you need a heartbeat. You can implement the heartbeat monitor
> as a separate service, which it is: it's a service that reads the
> heartbeat from your daemon and sends it an s6-svc -r command when
> it fails to receive the heartbeat from a period of time.
> If you don't need anything more than that, you can probably write
> the heartbeat monitor in shell (or even in execline!) It's a call
> to mkfifo then a read loop. :)
>

Thanks for the steer on the pattern.

--
Alex Kiernan
Re: Watchdog wrapper for s6
>I have a problem process that's getting stuck somewhere in its network
>code (clearly the right answer is to find/fix the problem, but it's
>infrequent and we're up against time constraints...)
>
>I'm looking at just adding a wrapper around it which does something
>along the lines of the existing s6 notification - have the supervised
>process write a regular character to an inherited fd, if it ever stops
>then the wrapper kills everyone and exits allowing the supervision to
>restart it.

What you want is process monitoring. s6 doesn't do that, because
the monitoring needs are very process-specific and it's impossible to
predict all the functionality that every daemon under the sun could
want to use.

Here, you need a heartbeat. You can implement the heartbeat monitor
as a separate service, which it is: it's a service that reads the
heartbeat from your daemon and sends it an s6-svc -r command when
it fails to receive the heartbeat for a period of time.
If you don't need anything more than that, you can probably write
the heartbeat monitor in shell (or even in execline!) It's a call
to mkfifo then a read loop. :)

--
Laurent
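
For illustration only, the "mkfifo then a read loop" pattern might look roughly like the sketch below. It is written in Python rather than shell or execline, and it is not code from this thread. The function names, the one-byte-per-beat protocol, and the idea that the daemon opens the FIFO for writing itself are all assumptions. The only s6 command used is the `s6-svc -r` mentioned above.

```python
import os
import select
import subprocess


def wait_for_silence(fd, timeout):
    """Consume heartbeat bytes from fd until they stop.

    Returns "timeout" if nothing arrives within `timeout` seconds,
    or "eof" if the writer closes its end.
    """
    while True:
        readable, _, _ = select.select([fd], [], [], timeout)
        if not readable:
            return "timeout"      # no heartbeat in time
        if not os.read(fd, 4096):
            return "eof"          # writer went away
        # Got at least one heartbeat byte: wait for the next one.


def monitor(fifo_path, service_dir, timeout):
    """Run forever as the heartbeat monitor service (hypothetical layout)."""
    if not os.path.exists(fifo_path):
        os.mkfifo(fifo_path)
    while True:
        # Blocks until the daemon opens the FIFO for writing.
        fd = os.open(fifo_path, os.O_RDONLY)
        try:
            if wait_for_silence(fd, timeout) == "timeout":
                # Ask the supervisor to restart the stuck daemon.
                subprocess.run(["s6-svc", "-r", service_dir], check=False)
        finally:
            os.close(fd)
```

The monitor's run script would call something like `monitor("/run/mydaemon/heartbeat", "/run/service/mydaemon", 30.0)`, where both paths and the 30-second period are placeholders. One limitation of this sketch: a daemon that hangs before it ever opens the FIFO is not detected, because the monitor is still blocked in `os.open`.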
Watchdog wrapper for s6
I have a problem process that's getting stuck somewhere in its network
code (clearly the right answer is to find/fix the problem, but it's
infrequent and we're up against time constraints...)

I'm looking at just adding a wrapper around it which does something
along the lines of the existing s6 notification - have the supervised
process write a regular character to an inherited fd, if it ever stops
then the wrapper kills everyone and exits allowing the supervision to
restart it.

My google-fu has failed to turn up anything which does this, am I just
failing to craft the right query, or is this just a mad/bad idea? Or
does anyone have a pointer?

--
Alex Kiernan