On Wed, May 30, 2018 at 04:47:31PM +0200, William Dauchy wrote:
> Hello William L.,
> 

Hi William D. :-)

> I did some more testing:
> I simplified my config, removing the multi binding part and cpu-map.
> Conclusion is, I have this issue when I activate the nbthread feature
> (meaning no problem without it).
> 
> I tried to kill -USR1 the failing worker, but it remains.
> 
> Here are the Sig* from status file of one of the failing process:
> SigQ:   0/192448
> SigPnd: 0000000000000000
> SigBlk: 0000000000000800
> SigIgn: 0000000000001800
> SigCgt: 0000000180300205
> 

I can reproduce the same situation here; however, I disabled the seamless
reload. When sending -USR1 to a remaining worker and running strace on it, I
can see that the signal is not blocked and that the worker is still polling.
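
For reference, the Sig* masks quoted above can be decoded by hand or with a
small script. The following is only a sketch: the function name is made up for
illustration, and it assumes the usual Linux /proc/<pid>/status encoding, where
bit N-1 of the hexadecimal mask stands for signal number N.

```python
import signal


def decode_sigmask(hex_mask):
    """Return the signal numbers set in a /proc/<pid>/status Sig* mask.

    Assumption: bit (N - 1) of the mask corresponds to signal number N.
    """
    mask = int(hex_mask, 16)
    return [n for n in range(1, 65) if mask & (1 << (n - 1))]


def signal_label(signum):
    """Best-effort name for a signal number on the current platform."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        # Numbers with no symbolic name here (e.g. reserved real-time signals).
        return "SIG%d" % signum


if __name__ == "__main__":
    # Values copied from the status file quoted above.
    for field, value in [
        ("SigBlk", "0000000000000800"),
        ("SigIgn", "0000000000001800"),
        ("SigCgt", "0000000180300205"),
    ]:
        numbers = decode_sigmask(value)
        print(field, numbers, [signal_label(n) for n in numbers])
```

With the usual Linux x86-64 numbering (10 is SIGUSR1, 12 is SIGUSR2), this
gives signal 10 in SigCgt and only signal 12 in SigBlk, which matches the
strace observation that USR1 is not blocked.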

My guess is that something is preventing the worker from leaving. I tried to
gdb the threads, but none of them seems to be in a deadlock. I have to
investigate more.

I'm not sure that it's related to the timing of the reload at all, but I could
be wrong.

> About the timing of reload, it seems to take a few seconds most of the
> time, so I *think* I am not reloading before the previous one is done,
> but I would appreciate being able to check this fact through a file
> before sending the reload; do you have any hint?

I think that with Type=notify, systemd does not try to reload while a previous
reload is not finished yet. You could 'grep reloading' on the output of
systemctl status haproxy to check that.
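
A scripted variant of that check could look like the sketch below. It is an
assumption-laden example, not something shipped with haproxy: the unit name
haproxy.service and the helper names are hypothetical, and it reads the
ActiveState property from systemctl show instead of grepping the
human-readable status output, on the assumption that systemd reports
"reloading" there while a reload is in progress.

```python
import subprocess


def parse_active_state(show_output):
    """Extract the ActiveState value from `systemctl show` output, or None."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "ActiveState":
            return value.strip()
    return None


def is_reloading(unit="haproxy.service"):
    """Return True if systemd reports the unit as reloading.

    The unit name is an assumption; adjust it to the local setup.
    """
    result = subprocess.run(
        ["systemctl", "show", "--property=ActiveState", unit],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_active_state(result.stdout) == "reloading"
```

A deployment script could call is_reloading() in a short polling loop and only
send the next reload once it returns False.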

Unfortunately the only way to know when the service is ready is with systemd,
but I planned to make the status available on the stats socket in the future.

-- 
William Lallemand
