Hi Pierre,

On Fri, Oct 21, 2016 at 03:05:55PM +0000, Pierre Cheynier wrote:
> First let's clarify again: we are on a systemd-based OS (CentOS 7), so reload
> is done by sending SIGUSR2 to haproxy-systemd-wrapper.
> Theoretically, this has absolutely no relation to our current issue (if I
> understand correctly the way the old processes are managed).

Yes, it has something to do with it, because it's the systemd-wrapper which
delivers the signal to the old processes in this mode, while in the normal
mode the processes get the signal directly from the new process. Another
important point is that *all* users having problems with zombie processes
are systemd users, with no exception. This problem never existed during the
first 15 years when systems were using a sane init instead, and it still
does not exist on non-systemd OSes.
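
To make the difference concrete, here is a minimal sketch of the
wrapper-mode path as described above. The pidfile location and names
are illustrative, not the actual haproxy-systemd-wrapper source; the
handler only records the signal, and the main loop forwards it to the
old pids (SIGUSR1 being haproxy's graceful-stop signal):

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static volatile sig_atomic_t caught_sig;

    /* keep the handler async-signal-safe: just record what arrived */
    static void handler(int sig) { caught_sig = sig; }

    /* forward the reload to every old pid listed in the pidfile */
    static void forward_reload(void)
    {
        FILE *f = fopen("/run/haproxy.pid", "r");  /* assumed path */
        long pid;

        if (!f)
            return;
        while (fscanf(f, "%ld", &pid) == 1)
            kill((pid_t)pid, SIGUSR1);    /* ask old process to drain */
        fclose(f);
        /* ...then spawn the new haproxy with -sf <old pids>... */
    }

    int main(void)
    {
        signal(SIGUSR2, handler);     /* reload request from systemd */
        for (;;) {
            pause();
            if (caught_sig == SIGUSR2) {
                caught_sig = 0;
                forward_reload();
            }
        }
    }

In the normal (non-wrapper) mode there is no such middleman: the new
haproxy process receives the old pids on its command line and signals
them itself.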

> This happens on servers with live traffic, but with a reasonable number of
> connections. I'm also able to reproduce it with no connections, but I have to
> be a bit more aggressive with the reload frequency (probably because children
> are faster to die).

OK, that's interesting. And when this happens, do they stay there forever?

> For me the problem is not whether we still have connections or not, it is
> that in this case some old processes are never "aware" that they should die,
> so they continue to listen for incoming requests, thanks to SO_REUSEPORT.
> Consequently, you end up with N processes listening with different configs.

Ah, this is getting very interesting. Maybe we should hack the
systemd-wrapper to log the signals it receives and the signals and pids
it sends, to see what is happening here. It may also be that the signal
is properly sent but never received (but why?).
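
For context, SO_REUSEPORT is what makes the symptom possible: each
process binds its own listening socket to the same port and the kernel
distributes incoming connections among all of them, so a forgotten old
process silently keeps taking traffic. As for the logging hack,
something along these lines would do (illustrative only, not a patch
against the real wrapper): wrap kill() so every forwarded signal is
logged with its target pid, and log receipt from the handler with
write(), which is async-signal-safe:

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    /* log every signal the wrapper forwards, with its target pid */
    static int log_kill(pid_t pid, int sig)
    {
        fprintf(stderr, "wrapper[%ld]: sending signal %d to pid %ld\n",
                (long)getpid(), sig, (long)pid);
        return kill(pid, sig);
    }

    /* log receipt; a fixed string via write() stays async-signal-safe */
    static void traced_handler(int sig)
    {
        static const char msg[] = "wrapper: signal received\n";
        (void)sig;
        write(STDERR_FILENO, msg, sizeof(msg) - 1);
    }

    int main(void)
    {
        signal(SIGUSR2, traced_handler);
        raise(SIGUSR2);                 /* -> "signal received" */
        return log_kill(getpid(), 0);   /* -> "sending signal 0 ..." */
    }

If a "signal received" line shows up with no matching "sending" line
after it, the wrapper got the signal but never forwarded it; if neither
shows up, the signal was lost before reaching the wrapper.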

> In the pstree I pasted in the previous message, there are 3 minutes between
> the first living instance and the last (and as you can see, we are quite
> aggressive with long connections):
> 
>      timeout client 2s
>      timeout server 5s
>      timeout connect 200ms
>      timeout http-keep-alive 200ms
> 
> Here is a Dockerfile that can be used to reproduce the issue (where I use
> haproxy-systemd-wrapper; just run with default settings, i.e. number of
> reloads=300 and interval between each=2ms):
> 
> https://github.com/pierrecdn/haproxy-reload-issue
> 
> docker build -t haproxy-reload-issue . && docker run --rm -ti 
> haproxy-reload-issue

That's very kind, thank you. However, I don't have access to a docker
machine, but I know some people on the list do, so I hope we'll quickly
find the cause and hopefully be able to fix it (unless it's another
smart invention from systemd to further annoy running daemons).

Another important point: when you say you reload every 2ms, are you
certain you have a way to ensure that everything is completely started
before you issue your signal to kill the old processes? I'm asking because,
thanks to the principle that the wrapper must stay in the foreground (smart
design choice from systemd), there's no way for a service manager to
know whether all processes are fully started or not. With a normal init,
when the process returns, all sub-processes have been created.
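
One crude way to approximate that readiness check from the outside,
assuming haproxy rewrites /run/haproxy.pid with the new pids on startup
(an assumption, and note the file may briefly still hold the old pids),
is to wait until every listed pid actually exists before firing the
next reload:

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    /* return 1 once every pid listed in the pidfile answers signal 0 */
    static int all_pids_alive(const char *pidfile)
    {
        FILE *f = fopen(pidfile, "r");
        long pid;
        int ok = 1, n = 0;

        if (!f)
            return 0;
        while (fscanf(f, "%ld", &pid) == 1) {
            n++;
            if (kill((pid_t)pid, 0) != 0 && errno != EPERM)
                ok = 0;               /* listed pid not running (yet) */
        }
        fclose(f);
        return n > 0 && ok;
    }

    int main(void)
    {
        while (!all_pids_alive("/run/haproxy.pid"))
            usleep(10000);            /* poll every 10ms */
        /* only send the next reload from this point on */
        return 0;
    }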

So at 2ms I could easily imagine that we're delivering signals to a
starting process, maybe even before it has had time to register a signal
handler, and that these signals are lost before the sub-processes are
started. Of course that's just a guess, but I don't see a clean way to
work around this, except of course by switching back to a reliable
service manager :-/
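
To illustrate the window I mean, here is the classic way a process can
shrink it (a sketch, not something the wrapper currently does): block
the signal as the very first thing in main(), install the handler, then
unblock, so a too-early signal is held pending instead of hitting the
default action or being acted on before the children exist. The window
before main() even runs cannot be closed this way.

    #include <signal.h>
    #include <stdio.h>

    static volatile sig_atomic_t got_reload;

    static void on_usr2(int sig) { (void)sig; got_reload = 1; }

    int main(void)
    {
        sigset_t set;
        struct sigaction sa;

        sigemptyset(&set);
        sigaddset(&set, SIGUSR2);
        sigprocmask(SIG_BLOCK, &set, NULL);   /* 1: hold SIGUSR2 pending */

        raise(SIGUSR2);                       /* simulate a reload sent
                                                 before we are ready */

        sa.sa_handler = on_usr2;
        sa.sa_flags = 0;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGUSR2, &sa, NULL);        /* 2: install the handler */

        /* ... sub-processes would be started here, signal blocked ... */

        sigprocmask(SIG_UNBLOCK, &set, NULL); /* 3: pending one delivered */
        if (got_reload)
            puts("reload request survived the startup window");
        return 0;
    }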

Regards,
Willy
