Re: Watchdog wrapper for s6

2021-11-21 Thread Alex Kiernan
On Sat, Nov 20, 2021 at 9:57 PM Laurent Bercot wrote:
>
>
> >I have a problem process that's getting stuck somewhere in its network
> >code (clearly the right answer is to find/fix the problem, but it's
> >infrequent and we're up against time constraints...)
> >
> >I'm looking at just adding a wrapper around it which does something
> >along the lines of the existing s6 notification - have the supervised
> >process write a regular character to an inherited fd, if it ever stops
> >then the wrapper kills everyone and exits allowing the supervision to
> >restart it.
>
>   What you want is process monitoring. s6 doesn't do that, because
> the monitoring needs are very process-specific and it's impossible to
> predict all the functionality that every daemon under the sun could
> want to use.
>

Indeed, wasn't sure if I was missing something!

>   Here, you need a heartbeat. You can implement the heartbeat monitor
> as a separate service, which it is: it's a service that reads the
> heartbeat from your daemon and sends it an s6-svc -r command when
> it fails to receive the heartbeat for a period of time.
>   If you don't need anything more than that, you can probably write
> the heartbeat monitor in shell (or even in execline!) It's a call
> to mkfifo then a read loop. :)
>

Thanks for the steer on the pattern.

-- 
Alex Kiernan


Re: Watchdog wrapper for s6

2021-11-20 Thread Laurent Bercot




>I have a problem process that's getting stuck somewhere in its network
>code (clearly the right answer is to find/fix the problem, but it's
>infrequent and we're up against time constraints...)
>
>I'm looking at just adding a wrapper around it which does something
>along the lines of the existing s6 notification - have the supervised
>process write a regular character to an inherited fd, if it ever stops
>then the wrapper kills everyone and exits allowing the supervision to
>restart it.


 What you want is process monitoring. s6 doesn't do that, because
the monitoring needs are very process-specific and it's impossible to
predict all the functionality that every daemon under the sun could
want to use.

 Here, you need a heartbeat. You can implement the heartbeat monitor
as a separate service, which it is: it's a service that reads the
heartbeat from your daemon and sends it an s6-svc -r command when
it fails to receive the heartbeat for a period of time.
 If you don't need anything more than that, you can probably write
the heartbeat monitor in shell (or even in execline!) It's a call
to mkfifo then a read loop. :)

--
 Laurent
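
A minimal sketch of the heartbeat-monitor service described in the reply
above, written in Python rather than the shell or execline the reply
suggests. It is not code from the thread. The function names, the FIFO
path, the service directory and the timeout are all hypothetical; the only
detail taken from the message is the "s6-svc -r" call. Opening the FIFO
read-write is an assumption that holds on Linux but is left unspecified
by POSIX.

```python
import os
import select
import subprocess


def wait_for_missed_heartbeat(fifo_path, timeout):
    """Return once no byte has arrived on the FIFO for `timeout` seconds."""
    # Assumption: O_RDWR keeps the FIFO open across daemon restarts, so
    # the reader never sees end-of-file (Linux behaviour).
    fd = os.open(fifo_path, os.O_RDWR)
    try:
        while True:
            ready, _, _ = select.select([fd], [], [], timeout)
            if not ready:
                return  # a full timeout period passed in silence
            os.read(fd, 4096)  # discard the heartbeat bytes that arrived
    finally:
        os.close(fd)


def monitor(fifo_path, service_dir, timeout, restart=None):
    """Restart the service every time its heartbeat goes missing."""
    if not os.path.exists(fifo_path):
        os.mkfifo(fifo_path)
    if restart is None:
        # "s6-svc -r" is the command named in the reply above.
        def restart():
            subprocess.run(["s6-svc", "-r", service_dir], check=False)
    while True:
        wait_for_missed_heartbeat(fifo_path, timeout)
        # The timeout must exceed the daemon's startup time, otherwise
        # this loop would restart it again before its first heartbeat.
        restart()
```

The monitor's run script would call something like
monitor("/run/mydaemon/heartbeat", "/run/service/mydaemon", 30.0), with
both paths being placeholders, and the daemon would write a byte to the
same FIFO at an interval comfortably shorter than the timeout.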



Watchdog wrapper for s6

2021-11-20 Thread Alex Kiernan
I have a problem process that's getting stuck somewhere in its network
code (clearly the right answer is to find/fix the problem, but it's
infrequent and we're up against time constraints...)

I'm looking at just adding a wrapper around it which does something
along the lines of the existing s6 notification - have the supervised
process write a regular character to an inherited fd, if it ever stops
then the wrapper kills everyone and exits allowing the supervision to
restart it.

My google-fu has failed to turn up anything which does this. Am I just
failing to craft the right query, or is this just a mad/bad idea? Or
does anyone have a pointer?

-- 
Alex Kiernan
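
For comparison, the wrapper idea from the original question could be
sketched as follows. This is an illustration only, not code from the
thread: the function name, the HEARTBEAT_FD environment variable and the
timeout are hypothetical. The wrapper hands the child the write end of a
pipe, kills the child's whole process group once the heartbeat stops, and
returns so that the supervisor can restart the service. Signals sent to
the wrapper (for example by s6-svc) are not forwarded in this sketch.

```python
import os
import select
import signal
import subprocess
import sys


def run_with_watchdog(argv, timeout, fd_env="HEARTBEAT_FD"):
    """Run argv; kill its process group if the heartbeat pipe goes quiet."""
    read_fd, write_fd = os.pipe()
    # Hypothetical convention: the child learns the fd number from the
    # environment and writes one byte to it at regular intervals.
    env = dict(os.environ)
    env[fd_env] = str(write_fd)
    child = subprocess.Popen(
        argv,
        env=env,
        pass_fds=(write_fd,),    # child inherits the fd under the same number
        start_new_session=True,  # child leads its own process group
    )
    os.close(write_fd)  # the wrapper keeps only the read end
    try:
        while True:
            ready, _, _ = select.select([read_fd], [], [], timeout)
            if ready and os.read(read_fd, 4096):
                continue  # heartbeat received, keep waiting
            break  # timeout, or end-of-file because the child went away
    finally:
        os.close(read_fd)
    if child.poll() is None:
        try:
            os.killpg(child.pid, signal.SIGKILL)  # "kills everyone"
        except ProcessLookupError:
            pass  # the group vanished in the meantime
    return child.wait()
```

A run script could end with something like
sys.exit(1 if run_with_watchdog(["mydaemon"], 30.0) else 0), where
"mydaemon" is a placeholder. Unlike the separate monitor service, this
keeps everything in one supervised process tree, at the cost of an extra
process between the supervisor and the daemon.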