Hi Patrick,

On Tue, May 07, 2019 at 02:23:15PM -0400, Patrick Hemmer wrote:
> So with the prevalence of the issues lately where haproxy is going 
> unresponsive and consuming 100% CPU, I wanted to see what thoughts were 
> on implementing systemd watchdog functionality.
> 
> In our case, haproxy going unresponsive is extremely problematic as our 
> clustering software (pacemaker+systemd) sees the service still running, 
> and doesn't realize it needs to restart the service or fail over.
> We could look into implementing some sort of custom check resource in 
> pacemaker, but before going down that route I wanted to explore the 
> systemd watchdog functionality.
> 
> The watchdog is implemented by periodically sending "WATCHDOG=1" on the 
> systemd notification socket. However there are a few different ways I 
> can see this being implemented.
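
(For reference, the ping itself is straightforward with libsystemd's sd-daemon
API; a minimal sketch, where pinging at half the configured interval is only a
common convention, not a requirement:)

    #include <stdint.h>
    #include <unistd.h>
    #include <systemd/sd-daemon.h>  /* sd_watchdog_enabled(), sd_notify(); link with -lsystemd */

    static void watchdog_loop(void)
    {
        uint64_t usec;

        /* returns > 0 only when WatchdogSec= is set for this service */
        if (sd_watchdog_enabled(0, &usec) <= 0)
            return;

        for (;;) {
            sd_notify(0, "WATCHDOG=1");
            /* ping at half the configured interval to keep a margin */
            usleep(usec / 2);
        }
    }
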
> 
> We could put this in the master control process, but this only tells us 
> if the master is functioning, not the workers, which are what really matter.
> 
> So the next thought would be for all of the workers to listen on a 
> shared socket. The master would periodically send a request to that 
> socket, and as long as it gets a response, it pings the watchdog. This 
> tells us that there is at least one worker able to accept traffic.
> 
> However if a frontend is bound to a specific worker, then that 
> frontend would be non-responsive, and the watchdog wouldn't restart 
> the service. For that the master would have to send a request to each 
> worker separately, and require a response from all of them before it 
> pings the watchdog. This would be better able to detect issues, but 
> people who aren't using any bound-to-process frontends could tolerate 
> the failure of a single worker and might prefer to schedule a 
> restart/reload at a less impactful time.
>

The master uses a special backend, invisible to the user, which contains one
server per worker and uses the worker's socketpair as its address. These
connections are always established, so the master and the workers can
communicate; this architecture is what allows forwarding commands to the CLI
of each worker.

One of my ideas was to do the equivalent of adding a "check" keyword on each
of these server lines. We would have to implement a special check which sends
a CLI command and waits for the response.

If one of the servers does not respond, we could execute the exit-on-failure
procedure.
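
A rough sketch of what that could look like on the master side; the
worker_fd[] array, the use of "show info" as a ping, and the poll() timeout
are purely illustrative, not the actual internal API:

    #include <poll.h>
    #include <string.h>
    #include <unistd.h>
    #include <systemd/sd-daemon.h>  /* sd_notify(); link with -lsystemd */

    /* Check every worker over its socketpair CLI connection; only ping the
     * systemd watchdog once all of them have answered. Returns -1 on the
     * first worker that fails to reply, so the caller can run the
     * exit-on-failure procedure instead.
     */
    static int check_workers(const int *worker_fd, int nb_workers, int timeout_ms)
    {
        char buf[512];

        for (int i = 0; i < nb_workers; i++) {
            struct pollfd pfd = { .fd = worker_fd[i], .events = POLLIN };
            const char *ping = "show info\n";  /* any harmless CLI command */

            if (write(worker_fd[i], ping, strlen(ping)) < 0)
                return -1;

            /* a worker stuck in a 100% CPU loop will never answer */
            if (poll(&pfd, 1, timeout_ms) <= 0)
                return -1;
            if (read(worker_fd[i], buf, sizeof(buf)) <= 0)
                return -1;
        }

        sd_notify(0, "WATCHDOG=1");
        return 0;
    }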

> 
> The last idea would be to have the watchdog watch the master only, and 
> the master watches the workers in turn. If a worker stops responding, 
> the master would restart just that one worker.
> 

Restarting only one worker is not a good idea: it's not possible with the
current architecture and would be too complicated. In my opinion it's better
to kill everything so systemd can restart the service properly with
Restart=on-failure; this is what is already done when one of the workers
segfaults, for example.
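
For completeness, nothing exotic would be needed on the unit side; a sketch
with the standard directives, the WatchdogSec value being only an example:

    [Service]
    Type=notify
    Restart=on-failure
    # enables the watchdog logic; pick a value larger than the ping interval
    WatchdogSec=30s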

> 
> Any thoughts on the matter, or do we not want to do this, and rely on a 
> custom check in the cluster management software?
> 
> -Patrick
> 

-- 
William Lallemand
