Re: remaining process after (seamless) reload

2018-06-22 Thread William Dauchy
On Fri, Jun 22, 2018 at 2:28 PM William Lallemand wrote: > The seamless reload only transfers the listeners, the remaining connections > are > handled in the leaving process. It would be complicated to handle them in a > new > process with a different configuration. Thanks for the

Re: remaining process after (seamless) reload

2018-06-22 Thread Willy Tarreau
On Fri, Jun 22, 2018 at 02:41:44PM +0200, Christopher Faulet wrote: > You're right, removing the thread from all_threads_mask when it exits is a > good idea. Here a patch to do so. merged as well, thanks! Willy

Re: remaining process after (seamless) reload

2018-06-22 Thread Christopher Faulet
On 20/06/2018 at 18:29, Willy Tarreau wrote: On Wed, Jun 20, 2018 at 04:42:58PM +0200, Christopher Faulet wrote: When HAProxy is shutting down, it exits the polling loop when there are no jobs left (jobs == 0). When there are no threads, it works pretty well, but when HAProxy is started with

Re: remaining process after (seamless) reload

2018-06-22 Thread William Lallemand
On Fri, Jun 22, 2018 at 12:03:22PM +0200, William Dauchy wrote: > On Thu, Jun 21, 2018 at 5:21 PM William Lallemand > wrote: > > Once you are sure this is not a bug and that a client is still connected, > > you > > could use the keyword 'hard-stop-after' to force a hard stop. > > After double

Re: remaining process after (seamless) reload

2018-06-22 Thread William Dauchy
On Thu, Jun 21, 2018 at 5:21 PM William Lallemand wrote: > Once you are sure this is not a bug and that a client is still connected, you > could use the keyword 'hard-stop-after' to force a hard stop. After double checking some cases, indeed there are still a few remaining established

Re: remaining process after (seamless) reload

2018-06-22 Thread Willy Tarreau
On Thu, Jun 21, 2018 at 05:20:19PM +0200, William Lallemand wrote: > I'm waiting a few days for your feedback and we will probably release a new > 1.8 > version including those fixes. I've merged it in master now. Willy

Re: remaining process after (seamless) reload

2018-06-21 Thread William Lallemand
On Thu, Jun 21, 2018 at 05:10:35PM +0200, William Dauchy wrote: > On Thu, Jun 21, 2018 at 5:03 PM William Lallemand > wrote: > > Maybe one client was still connected on a frontend (including the stats > > socket). > > The process first unbinds the listeners, and then waits for all clients to > >

Re: remaining process after (seamless) reload

2018-06-21 Thread William Dauchy
On Thu, Jun 21, 2018 at 5:03 PM William Lallemand wrote: > Maybe one client was still connected on a frontend (including the stats > socket). > The process first unbinds the listeners, and then waits for all clients to > leave. > It's difficult to see what's going on since the stats socket is

Re: remaining process after (seamless) reload

2018-06-21 Thread William Lallemand
On Thu, Jun 21, 2018 at 04:47:50PM +0200, William Dauchy wrote: > Hi Christopher, > > A quick followup from this morning. > > On Thu, Jun 21, 2018 at 10:41 AM William Dauchy wrote: > > it seems better now, but not completely gone, in a way, I think we now > > have a new issue. > > this morning,

Re: remaining process after (seamless) reload

2018-06-21 Thread William Dauchy
Hi Christopher, A quick followup from this morning. On Thu, Jun 21, 2018 at 10:41 AM William Dauchy wrote: > it seems better now, but not completely gone, in a way, I think we now > have a new issue. > this morning, on one test machine I have a process which remains polling > traffic so it

Re: remaining process after (seamless) reload

2018-06-21 Thread William Dauchy
Hello Christopher, Thanks for the followup patch. On Wed, Jun 20, 2018 at 04:42:58PM +0200, Christopher Faulet wrote: > Hum, ok, forget the previous patch. Here is a second try. It solves the same > bug using another way. In this patch, all threads must enter in the sync > point to exit. I hope

Re: remaining process after (seamless) reload

2018-06-20 Thread Willy Tarreau
On Wed, Jun 20, 2018 at 04:42:58PM +0200, Christopher Faulet wrote: > When HAProxy is shutting down, it exits the polling loop when there are no jobs > left (jobs == 0). When there are no threads, it works pretty well, but when > HAProxy is started with several threads, a thread can decide to exit

Re: remaining process after (seamless) reload

2018-06-20 Thread Christopher Faulet
On 20/06/2018 at 15:11, William Dauchy wrote: Hello Christopher, Thank you for the quick answer and the patch. On Wed, Jun 20, 2018 at 11:32 AM Christopher Faulet wrote: Here is a patch to prevent a thread from exiting its polling loop while others are waiting in the sync point. It is a

Re: remaining process after (seamless) reload

2018-06-20 Thread William Dauchy
Hello Christopher, Thank you for the quick answer and the patch. On Wed, Jun 20, 2018 at 11:32 AM Christopher Faulet wrote: > Here is a patch to prevent a thread from exiting its polling loop while others > are waiting in the sync point. It is a theoretical patch because I was > not able to reproduce

Re: remaining process after (seamless) reload

2018-06-20 Thread Christopher Faulet
On 19/06/2018 at 16:42, William Dauchy wrote: On Tue, Jun 19, 2018 at 4:30 PM William Lallemand wrote: That's interesting, we can suppose that this bug is not related anymore to the signal problem we had previously. Looks like it's blocking in the thread sync point. Are you able to do a

Re: remaining process after (seamless) reload

2018-06-19 Thread William Dauchy
On Tue, Jun 19, 2018 at 4:30 PM William Lallemand wrote: > That's interesting, we can suppose that this bug is not related anymore to the > signal problem we had previously. > Looks like it's blocking in the thread sync point. > Are you able to do a backtrace with gdb? that could help a lot.

Re: remaining process after (seamless) reload

2018-06-19 Thread William Lallemand
On Tue, Jun 19, 2018 at 04:09:51PM +0200, William Dauchy wrote:
> Hello William,
>
> Not much progress on my side, apart from the fact I forgot to mention
> where the process are now stuck using all the cpu, in
> src/hathreads.c:112
>     while (*barrier != all_threads_mask)
>         pl_cpu_relax();
>

Re: remaining process after (seamless) reload

2018-06-19 Thread William Dauchy
Hello William,

Not much progress on my side, apart from the fact that I forgot to mention where the processes are now stuck using all the CPU, in src/hathreads.c:112:

    while (*barrier != all_threads_mask)
        pl_cpu_relax();

--
William
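For readers following the thread, here is a minimal sketch of why a spin loop of this shape can burn CPU forever. It is not HAProxy's code: the helper names (thread_arrive, thread_exit_sim, simulate) and the three-thread setup are hypothetical, and the model is single-threaded so that the outcome is deterministic. It assumes the scenario discussed elsewhere in the thread, where one thread leaves for good while it is still counted in all_threads_mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only, not HAProxy's real symbols. */
static unsigned long all_threads_mask; /* one bit per thread expected at the sync point */
static unsigned long barrier;          /* bits of the threads that have arrived */

/* A thread reaching the sync point sets its own bit. */
static void thread_arrive(unsigned int tid)
{
    barrier |= 1UL << tid;
}

/* The condition the waiting threads spin on. */
static bool barrier_complete(void)
{
    return barrier == all_threads_mask;
}

/* A thread exiting for good. When clear_from_mask is true it also removes
 * its bit from the expected mask, so nobody waits for it any longer. */
static void thread_exit_sim(unsigned int tid, bool clear_from_mask)
{
    if (clear_from_mask)
        all_threads_mask &= ~(1UL << tid);
}

/* Three threads are expected; thread 2 exits, threads 0 and 1 arrive.
 * Returns whether the waiters would ever see the barrier complete. */
static bool simulate(bool clear_from_mask)
{
    all_threads_mask = 0x7UL;
    barrier = 0;
    thread_exit_sim(2, clear_from_mask);
    thread_arrive(0);
    thread_arrive(1);
    return barrier_complete();
}
```

In this model, simulate(false) returns false: the two remaining threads would keep spinning because bit 2 is never set. simulate(true) returns true, since the departed thread is no longer expected at the barrier.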

Re: remaining process after (seamless) reload

2018-06-15 Thread William Dauchy
Hello, Thanks for your answer. Here is the requested information. On Fri, Jun 15, 2018 at 11:22 AM William Lallemand wrote: > - haproxy -vv
HA-Proxy version 1.8.9-83616ec 2018/05/18
Copyright 2000-2018 Willy Tarreau
Build options :
  TARGET = linux2628
  CPU = generic
  CC = gcc

Re: remaining process after (seamless) reload

2018-06-15 Thread William Lallemand
On Tue, Jun 12, 2018 at 04:56:24PM +0200, William Dauchy wrote: > On Tue, Jun 12, 2018 at 04:33:43PM +0200, William Lallemand wrote: > > Those processes are still using a lot of CPU... > > Are they still delivering traffic? > > they don't seem to handle any traffic (at least I can't see it

Re: remaining process after (seamless) reload

2018-06-12 Thread William Dauchy
On Tue, Jun 12, 2018 at 04:33:43PM +0200, William Lallemand wrote: > Those processes are still using a lot of CPU... > Are they still delivering traffic? they don't seem to handle any traffic (at least I can't see it through strace) but that's the main difference here, using lots of CPU. > >

Re: remaining process after (seamless) reload

2018-06-12 Thread William Lallemand
On Tue, Jun 12, 2018 at 04:00:25PM +0200, William Dauchy wrote: > Hello William L, > > On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote: > > That's great news! > > > > Here's the new patches. It shouldn't change anything to the fix, it only > > changes the sigprocmask to

Re: remaining process after (seamless) reload

2018-06-12 Thread William Dauchy
Hello William L, On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote: > That's great news! > > Here's the new patches. It shouldn't change anything to the fix, it only > changes the sigprocmask to pthread_sigmask. In fact, I now have a different but similar issue. root 18547

Re: remaining process after (seamless) reload

2018-06-08 Thread Willy Tarreau
On Fri, Jun 08, 2018 at 06:22:39PM +0200, William Dauchy wrote: > On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote: > > That's great news! > > > > Here's the new patches. It shouldn't change anything to the fix, it only > > changes the sigprocmask to pthread_sigmask. > > thanks,

Re: remaining process after (seamless) reload

2018-06-08 Thread William Dauchy
On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote: > That's great news! > > Here's the new patches. It shouldn't change anything to the fix, it only > changes the sigprocmask to pthread_sigmask. thanks, I attached the backport for 1.8 and started a new test with them. Feel free to

Re: remaining process after (seamless) reload

2018-06-08 Thread William Lallemand
On Fri, Jun 08, 2018 at 06:20:21PM +0200, Willy Tarreau wrote: > On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote: > > That's great news! > > > > Here's the new patches. It shouldn't change anything to the fix, it only > > changes the sigprocmask to pthread_sigmask. > > OK, I

Re: remaining process after (seamless) reload

2018-06-08 Thread Willy Tarreau
On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote: > That's great news! > > Here's the new patches. It shouldn't change anything to the fix, it only > changes the sigprocmask to pthread_sigmask. OK, I can merge them right now if you want. At the very least it will kill a whole

Re: remaining process after (seamless) reload

2018-06-08 Thread William Lallemand
On Fri, Jun 08, 2018 at 02:10:44PM +0200, William Dauchy wrote: > On Thu, Jun 07, 2018 at 11:50:45AM +0200, William Lallemand wrote: > > Sorry for the late reply, I managed to reproduce and fix what seems to be > > the bug. > > The signal management was not handled correctly with threads. > >

Re: remaining process after (seamless) reload

2018-06-08 Thread William Dauchy
Hello William L., On Thu, Jun 07, 2018 at 11:50:45AM +0200, William Lallemand wrote: > Sorry for the late reply, I managed to reproduce and fix what seems to be the > bug. > The signal management was not handled correctly with threads. > Could you try those patches and see if it fixes the

Re: remaining process after (seamless) reload

2018-06-08 Thread William Lallemand
On Thu, Jun 07, 2018 at 12:02:46PM +0200, Willy Tarreau wrote:
> On Thu, Jun 07, 2018 at 11:50:45AM +0200, William Lallemand wrote:
> > /* block signal delivery during processing */
> > +#ifdef USE_THREAD
> > + pthread_sigmask(SIG_SETMASK, &blocked_sig, &old_sig);
> > +#else
> >

Re: remaining process after (seamless) reload

2018-06-07 Thread Willy Tarreau
On Thu, Jun 07, 2018 at 11:50:45AM +0200, William Lallemand wrote:
> /* block signal delivery during processing */
> +#ifdef USE_THREAD
> + pthread_sigmask(SIG_SETMASK, &blocked_sig, &old_sig);
> +#else
> sigprocmask(SIG_SETMASK, &blocked_sig, &old_sig);
> +#endif

I think for the merge we'd rather put a

Re: remaining process after (seamless) reload

2018-06-07 Thread William Lallemand
Hi guys, Sorry for the late reply, I managed to reproduce and fix what seems to be the bug. The signal management was not handled correctly with threads. Could you try those patches and see if they fix the problem? Thanks. -- William Lallemand From d695242fb260538bd8db323715d627c4a9deacc7

Re: remaining process after (seamless) reload

2018-05-30 Thread William Lallemand
On Wed, May 30, 2018 at 07:57:03PM +0200, Tim Düsterhus wrote: > William, > > On 30.05.2018 at 19:45, William Lallemand wrote: > >> @William Lallemand Possibly the sd_notifyf should be moved below > >> mworker_unblock_signals in mworker_wait? > >> > > > > This shouldn't happen with or without

Re: remaining process after (seamless) reload

2018-05-30 Thread William Dauchy
On Wed, May 30, 2018 at 5:29 PM, William Lallemand wrote: > I can reproduce the same situation there, however I disabled the seamless > reload. When doing a -USR1 & strace on a remaining worker, I can see that the > signal is not blocked, and that it's still polling good news! >

Re: remaining process after (seamless) reload

2018-05-30 Thread Tim Düsterhus
William, On 30.05.2018 at 19:45, William Lallemand wrote: >> @William Lallemand Possibly the sd_notifyf should be moved below >> mworker_unblock_signals in mworker_wait? >> > > This shouldn't happen with or without systemd. I can reproduce it without > using systemd, we should not rely on an

Re: remaining process after (seamless) reload

2018-05-30 Thread William Lallemand
Hi Tim, On Tue, May 29, 2018 at 09:33:48PM +0200, Tim Düsterhus wrote: > > @William Lallemand Possibly the sd_notifyf should be moved below > mworker_unblock_signals in mworker_wait? > This shouldn't happen with or without systemd. I can reproduce it without using systemd, we should not rely

Re: remaining process after (seamless) reload

2018-05-30 Thread William Lallemand
On Wed, May 30, 2018 at 04:47:31PM +0200, William Dauchy wrote: > Hello William L., > Hi William D. :-) > I did some more testing: > I simplified my config, removing the multi binding part and cpu-map. > Conclusion is, I have this issue when I activate nbthread feature > (meaning no problem

Re: remaining process after (seamless) reload

2018-05-30 Thread William Dauchy
Hello William L., I did some more testing: I simplified my config, removing the multi binding part and cpu-map. Conclusion is, I have this issue when I activate nbthread feature (meaning no problem without). I tried to kill -USR1 the failing worker, but it remains. Here are the Sig* from status

Re: remaining process after (seamless) reload

2018-05-29 Thread Willy Tarreau
On Tue, May 29, 2018 at 08:35:19PM +0200, William Dauchy wrote: > I however don't see on which part haproxy would > need to do dns lookup on our side. Front end side is host matching and > backend side is IP only. We studied the possibility that a reload happens at the exact moment the config

Re: remaining process after (seamless) reload

2018-05-29 Thread William Dauchy
Hello Tim, On Tue, May 29, 2018 at 9:33 PM, Tim Düsterhus wrote: > Run systemctl status haproxy. It shows the status: > >> [timwolla@/s/haproxy (maxrewrite-warn)]sudo systemctl status haproxy >> ● haproxy.service - HAProxy Load Balancer >>Loaded: loaded (/lib/systemd/system/haproxy.service;

Re: remaining process after (seamless) reload

2018-05-29 Thread Tim Düsterhus
William, On 29.05.2018 at 17:09, William Dauchy wrote: > We are using Type=notify. I however cannot guarantee we do not trigger > a new reload, before the previous one is done. Is there a way to check > the "ready" state you mentioned? Run systemctl status haproxy. It shows the status: >

Re: remaining process after (seamless) reload

2018-05-29 Thread William Dauchy
Hello Dave, On Tue, May 29, 2018 at 5:55 PM, Dave Chiluk wrote: > We've battled the same issue with our haproxys. We root caused it to slow > dns lookup times while parsing the config was causing haproxy config parsing > to take so long that we were attempting to reload again before the

Re: remaining process after (seamless) reload

2018-05-29 Thread Dave Chiluk
We've battled the same issue with our haproxys. We root caused it to slow dns lookup times while parsing the config was causing haproxy config parsing to take so long that we were attempting to reload again before the original reload had completed. I'm still not sure why or where the Signals are

Re: remaining process after (seamless) reload

2018-05-29 Thread William Dauchy
Hello William, Sorry for the late answer. > Are the problematic workers leaving when you reload a second time? no, they seem to stay for a long time (forever?) > Did you try to kill -USR1 the worker ? It should exit with "Former worker > $PID > exited with code 0" on stderr. > If not,

Re: remaining process after (seamless) reload

2018-05-28 Thread William Lallemand
On Thu, May 24, 2018 at 11:00:29PM +0200, William Dauchy wrote: > On Thu, May 24, 2018 at 12:01:38PM +0200, William Lallemand wrote: > > I managed to reproduce something similar with the 1.8.8 version. It looks > > like > > letting a socat connected to the socket helps. > > > > I'm looking into

Re: remaining process after (seamless) reload

2018-05-24 Thread William Dauchy
Hi William, Thank you for your reply. On Thu, May 24, 2018 at 12:01:38PM +0200, William Lallemand wrote: > I managed to reproduce something similar with the 1.8.8 version. It looks like > letting a socat connected to the socket helps. > > I'm looking into the code to see what's happening.

Re: remaining process after (seamless) reload

2018-05-24 Thread William Lallemand
On Thu, May 24, 2018 at 10:07:23AM +0200, William Dauchy wrote: > On Wed, May 23, 2018 at 08:45:04PM +0200, William Dauchy wrote: > > More details which could help understand what is going on: > > > > ps output: > > > > root 15928 0.3 0.0 255216 185268 ? Ss May21 10:11 > >

Re: remaining process after (seamless) reload

2018-05-24 Thread William Dauchy
On Wed, May 23, 2018 at 08:45:04PM +0200, William Dauchy wrote: > More details which could help understand what is going on: > > ps output: > > root 15928 0.3 0.0 255216 185268 ? Ss May21 10:11 > /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf > 16988

Re: remaining process after (seamless) reload

2018-05-23 Thread William Dauchy
On Wed, May 23, 2018 at 06:49:09PM +0200, William Dauchy wrote: > We do frequent reloads (approximately every 10s). > After a while some processes remain alive and seem to never exit (waited >24 > hours). While stracing them, some of them are still handling traffic and > doing healthchecks.

remaining process after (seamless) reload

2018-05-23 Thread William Dauchy
Hello, I am trying to understand a possible issue we have regarding haproxy (seamless) reloads. I am using haproxy v1.8.9 with the following config (using nbthread):

global
    log 127.0.0.1 local0 info
    maxconn 262144
    user haproxy
    group haproxy
    nbproc 1
    daemon