Hi, 

Sorry, wrong order in the answers.

> Yes it has something to do with it because it's the systemd-wrapper which
> delivers the signal to the old processes in this mode, while in the normal
> mode the processes get the signal directly from the new process. Another
> important point is that exactly *all* users having problem with zombie
> processes are systemd users, with no exception. And this problem has never
> existed over the first 15 years where systems were using a sane init
> instead and still do not exist on non-systemd OSes.

Unfortunately, I remember we had the same issue (though less frequently) on
CentOS 6, which is init-based.
I tried to reproduce it there but didn't succeed, so let's ignore that for
now; it may have been related to something else.

> OK that's interesting. And when this happens, they stay there forever ?

Yes, these processes are never stopped and are still bound to the socket.

> Ah this is getting very interesting. Maybe we should hack systemd-wrapper
> to log the signals it receives and the signals and pids it sends to see
> what is happening here. It may also be that the signal is properly sent
> but never received (but why ?).

Clearly. Apparently I sometimes have wrong information in the pidfile...

Have a look at the journald logs:

Oct 24 12:26:57 haproxys01e02-par haproxy-systemd-wrapper[44319]: 
haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f 
/etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 44941
Oct 24 12:26:57 haproxys01e02-par haproxy-systemd-wrapper[44319]: [WARNING] 
297/122657 (44951) : config : 'option forwardfor' ignored for frontend 
'https-in' as it requires HTTP mode.
Oct 24 12:27:00 haproxys01e02-par haproxy-systemd-wrapper[44319]: 
haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f 
/etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 44952
Oct 24 12:27:00 haproxys01e02-par haproxy-systemd-wrapper[44319]: [WARNING] 
297/122700 (44978) : config : 'option forwardfor' ignored for frontend 
'https-in' as it requires HTTP mode.
Oct 24 12:27:05 haproxys01e02-par haproxy-systemd-wrapper[44319]: 
haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f 
/etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 44983
Oct 24 12:27:05 haproxys01e02-par haproxy-systemd-wrapper[44319]: [WARNING] 
297/122705 (45131) : config : 'option forwardfor' ignored for frontend 
'https-in' as it requires HTTP mode.
Oct 24 12:27:09 haproxys01e02-par haproxy-systemd-wrapper[44319]: 
haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f 
/etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 45132
Oct 24 12:27:09 haproxys01e02-par haproxy-systemd-wrapper[44319]: [WARNING] 
297/122709 (45146) : config : 'option forwardfor' ignored for frontend 
'https-in' as it requires HTTP mode.

Luckily, I have an error in my config, which lets me see the PID of the first
child process :).
Here we can see that:
* 44978 references (-sf) 44952 (child of 44951)
* 45131 references 44983, a PID that we have never seen in the logs... (so
44978 and its child will stay alive forever!)
* 45146 references 45132 (child of 45131)
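
The same cross-check can be scripted. Below is a minimal sketch in Python,
assuming the wrapped journald lines have been joined back together. The names
check_reloads and max_gap are hypothetical, and the rule "the -sf PID should
directly follow the PID printed by the previous start" is only a heuristic
taken from the observation above (it ignores PID wrap-around and busy hosts).

```python
import re

# "-sf" may be followed by one or more PIDs.
EXEC_RE = re.compile(r"executing .* -sf((?:\s+\d+)+)")
# haproxy prints its own PID in parentheses in the [WARNING] line.
WARN_RE = re.compile(r"\[WARNING\]\s+\d+/\d+\s+\((\d+)\)")

def check_reloads(lines, max_gap=1):
    """Return (sf_pid, previous_logged_pid, status) for every -sf PID.

    Heuristic (an assumption, not haproxy behaviour): the PID passed to
    -sf should be at most max_gap above the PID logged by the previous
    start, since it is expected to be that process's child.
    """
    previous = None
    report = []
    for line in lines:
        m = EXEC_RE.search(line)
        if m:
            for pid in map(int, m.group(1).split()):
                if previous is None:
                    status = "unknown"   # nothing to compare against yet
                elif 0 < pid - previous <= max_gap:
                    status = "ok"
                else:
                    status = "suspect"
                report.append((pid, previous, status))
            continue
        m = WARN_RE.search(line)
        if m:
            previous = int(m.group(1))
    return report

# The relevant parts of the log lines quoted in this mail, unwrapped.
CMD = "executing /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf "
LOG = [
    CMD + "44941", "[WARNING] 297/122657 (44951) : config",
    CMD + "44952", "[WARNING] 297/122700 (44978) : config",
    CMD + "44983", "[WARNING] 297/122705 (45131) : config",
    CMD + "45132", "[WARNING] 297/122709 (45146) : config",
]

for sf_pid, prev_pid, status in check_reloads(LOG):
    print(sf_pid, prev_pid, status)
```

On the excerpt above, only the reload with -sf 44983 is flagged as suspect,
which matches the reading of the logs by hand.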

> That's very kind, thank you. However I don't have access to a docker
> machine but I know some people on the list do so I hope we'll quickly
> find the cause and hopefully be able to fix it (unless it's another
> smart invention from systemd to further annoy running deamons).

> Another important point, when you say you restart every 2ms, are you
> certain you have a way to ensure that everything is completely started
> before you issue your signal to kill the old process ? 
> (..)
> So at 2ms I could easily imagine that we're delivering signals to a
> starting process, maybe even before it has the time to register a signal
> handler, and that these signals are lost before the sub-processes are
> started. 

Clearly not: my test is trivial. But I observe the same behaviour on a
platform that operates on a different time scale (a reload every 1 to 10
seconds on average), so the 2ms loop was just a way to reproduce the issue and
be able to investigate it in the container, e.g. with gdb.

> Regards,
> Willy

Thanks !
Pierre
