Re: Haproxy refuses new connections when doing a reload followed by a restart

Apollon Oikonomopoulos Thu, 12 Oct 2017 07:51:36 -0700

On 16:17 Thu 12 Oct     , William Lallemand wrote:
> Hi,
> 
> On Thu, Oct 12, 2017 at 01:19:58PM +0300, Apollon Oikonomopoulos wrote:
> > The biggest issue here is that we are using a signal to trigger the 
> > reload (which is a complex, non-atomic operation) and let things settle 
> > on their own. Systemd assumes that as soon as the signal is delivered 
> > (i.e.  the ExecReload command is done), the reload is finished, while in 
> > our case the reload is finished when the old haproxy process is really 
> > dead. Using a signal to trigger the reload is handy, so we could keep 
> > that, but the wrapper would need some changes to make reloads more 
> > robust:
> > 
> >  1. It should use sd_notify(3) to communicate the start/stop/reload 
> >     status to systemd (that would also mean converting the actual 
> >     service to Type=notify). This way no other operation will be 
> >     performed on the unit until the reload is finished and the process 
> >     group is in a known-good state.
> > 
> >  2. It should handle the old process better: apart from relying on the 
> >     new haproxy process for killing the old one, it should explicitly 
> >     SIGKILL it after a given timeout if it's not dead yet and make sure 
> >     reloads are timeboxed.
> > 
> > IIUC, in 1.8 the wrapper has been replaced by the master process which 
> > seems to do point 2 above, but point 1 is something that should still be 
> > handled IMHO.
> 
> One helpful feature I read in the documentation is the usage of the
> sd_notify(..  "READY=1").  It can be useful for configuration files that takes
> time to process, for example those with a lot of ssl frontends. This signal
> could be send once the children has been forked.
> 
> It's difficult to know when a reload is completely finished (old processes
> killed) in case of long TCP sessions. So, if we use this system there is a 
> risk
> to trigger a timeout in systemd on the reload isn't it?


The Reload timeout is apparently controlled by TimeoutStartSec in 
systemd.

> feature for the reload, it should be done after the fork of the new 
> processes,
> not after the leaving of the old processes, because the processes are 
> ready to
> receive traffic at this stage.

That's true. OTOH the problem with haproxy-systemd-wrapper is that once 
it re-exec's itself it loses track of the old processes completely 
(IIRC), combined with the fact that old processes may eat up a lot of 
memory. There are cases where you would prefer breaking a long TCP 
session after 30s if it would give you back 2GB of RSS, to having the 
process lying around just for one client.

> 
> However I'm not sure it's that useful, you can know when a process is ready
> using the logs, and it will add specific code for systemd and a dependency.
> 
> Are there really advantages to letting know systemd when a reload is finished
> or when a process is ready?

Yes, there are. systemd will only perform a single operation on a unit 
at a time, and will queue up the rest. When you inform systemd that 
something (startup/reload) is in progress, it will not let any other 
action happen until the first operation is finished. Now it's trivial to 
issue a ton of reloads in a row that will leave a ton of old processes 
lying around until they terminate.

The other advantage with Type=notify services is that systemd will wait 
for READY=1 before starting units with After=haproxy (although HAProxy 
is really a "leaf" kind of service).

Note that the dependency is really thin, and you can always make it a 
compile-time option. 

Regards,
Apollon

Re: Haproxy refuses new connections when doing a reload followed by a restart

Reply via email to