On Fri, Jun 22, 2018 at 2:28 PM William Lallemand wrote:
> The seamless reload only transfers the listeners; the remaining connections
> are handled in the leaving process. It would be complicated to handle them
> in a new process with a different configuration.
Thanks for the clarification
On Fri, Jun 22, 2018 at 02:41:44PM +0200, Christopher Faulet wrote:
> You're right, removing the thread from all_threads_mask when it exits is a
> good idea. Here is a patch to do so.
merged as well, thanks!
Willy
On 20/06/2018 at 18:29, Willy Tarreau wrote:
On Wed, Jun 20, 2018 at 04:42:58PM +0200, Christopher Faulet wrote:
When HAProxy is shutting down, it exits the polling loop when there are no jobs
left (jobs == 0). When there is no thread, it works pretty well, but when
HAProxy is started with several threads, a thread can decide to exit
On Fri, Jun 22, 2018 at 12:03:22PM +0200, William Dauchy wrote:
> On Thu, Jun 21, 2018 at 5:21 PM William Lallemand wrote:
> > Once you are sure this is not a bug and that a client is still connected,
> > you could use the keyword 'hard-stop-after' to force a hard stop.
>
> After double checking
On Thu, Jun 21, 2018 at 5:21 PM William Lallemand wrote:
> Once you are sure this is not a bug and that a client is still connected, you
> could use the keyword 'hard-stop-after' to force a hard stop.
After double checking some cases, indeed there are still a few
remaining established connections
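For readers landing here: the 'hard-stop-after' keyword mentioned above is set in the global section. A minimal illustration (the 30s value is arbitrary, not from the poster's configuration):

```
global
    # force leaving processes to stop at most 30s after a reload/soft-stop,
    # even if some clients are still connected
    hard-stop-after 30s
```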
On Thu, Jun 21, 2018 at 05:20:19PM +0200, William Lallemand wrote:
> I'm waiting a few days for your feedback, and we will probably release a
> new 1.8 version including those fixes.
I've merged it in master now.
Willy
On Thu, Jun 21, 2018 at 05:10:35PM +0200, William Dauchy wrote:
> On Thu, Jun 21, 2018 at 5:03 PM William Lallemand wrote:
> > Maybe one client was still connected on a frontend (including the stats
> > socket).
> > The process first unbinds the listeners, and then waits for all clients to
> > leave.
On Thu, Jun 21, 2018 at 5:03 PM William Lallemand wrote:
> Maybe one client was still connected on a frontend (including the stats
> socket).
> The process first unbinds the listeners, and then waits for all clients to
> leave.
> It's difficult to see what's going on since the stats socket is unbound
On Thu, Jun 21, 2018 at 04:47:50PM +0200, William Dauchy wrote:
> Hi Christopher,
>
> A quick followup from this morning.
>
> On Thu, Jun 21, 2018 at 10:41 AM William Dauchy wrote:
> > it seems better now, but not completely gone; in a way, I think we now
> > have a new issue.
> > This morning,
Hi Christopher,
A quick followup from this morning.
On Thu, Jun 21, 2018 at 10:41 AM William Dauchy wrote:
> it seems better now, but not completely gone; in a way, I think we now
> have a new issue.
> This morning, on one test machine I have a process which is still polling
> traffic
so it exit
Hello Christopher,
Thanks for the followup patch.
On Wed, Jun 20, 2018 at 04:42:58PM +0200, Christopher Faulet wrote:
> Hum, ok, forget the previous patch. Here is a second try. It solves the same
> bug in another way. In this patch, all threads must enter the sync
> point to exit. I hope i
On Wed, Jun 20, 2018 at 04:42:58PM +0200, Christopher Faulet wrote:
> When HAProxy is shutting down, it exits the polling loop when there are no
> jobs left (jobs == 0). When there is no thread, it works pretty well, but when
> HAProxy is started with several threads, a thread can decide to exit
On 20/06/2018 at 15:11, William Dauchy wrote:
Hello Christopher,
Thank you for the quick answer and the patch.
On Wed, Jun 20, 2018 at 11:32 AM Christopher Faulet wrote:
Here is a patch to prevent a thread from exiting its polling loop while others
are waiting in the sync point. It is a theoretical
Hello Christopher,
Thank you for the quick answer and the patch.
On Wed, Jun 20, 2018 at 11:32 AM Christopher Faulet wrote:
> Here is a patch to prevent a thread from exiting its polling loop while
> others are waiting in the sync point. It is a theoretical patch because I
> was not able to reproduce t
On 19/06/2018 at 16:42, William Dauchy wrote:
On Tue, Jun 19, 2018 at 4:30 PM William Lallemand wrote:
That's interesting; we can suppose that this bug is no longer related to the
signal problem we had previously.
Looks like it's blocking in the thread sync point.
Are you able to do a backtrace with gdb? That could help a lot.
On Tue, Jun 19, 2018 at 4:30 PM William Lallemand wrote:
> That's interesting; we can suppose that this bug is no longer related to the
> signal problem we had previously.
> Looks like it's blocking in the thread sync point.
> Are you able to do a backtrace with gdb? That could help a lot.
yes,
On Tue, Jun 19, 2018 at 04:09:51PM +0200, William Dauchy wrote:
> Hello William,
>
> Not much progress on my side, apart from the fact I forgot to mention
> where the processes are now stuck, using all the CPU, in
> src/hathreads.c:112:
>     while (*barrier != all_threads_mask)
>         pl_cpu_relax();
>
Th
Hello William,
Not much progress on my side, apart from the fact I forgot to mention
where the processes are now stuck, using all the CPU, in
src/hathreads.c:112:
    while (*barrier != all_threads_mask)
        pl_cpu_relax();
--
William
Hello,
Thanks for your answer. Here is the information requested.
On Fri, Jun 15, 2018 at 11:22 AM William Lallemand wrote:
> - haproxy -vv
HA-Proxy version 1.8.9-83616ec 2018/05/18
Copyright 2000-2018 Willy Tarreau
Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
On Tue, Jun 12, 2018 at 04:56:24PM +0200, William Dauchy wrote:
> On Tue, Jun 12, 2018 at 04:33:43PM +0200, William Lallemand wrote:
> > Those processes are still using a lot of CPU...
> > Are they still delivering traffic?
>
> they don't seem to handle any traffic (at least I can't see it through strace)
On Tue, Jun 12, 2018 at 04:33:43PM +0200, William Lallemand wrote:
> Those processes are still using a lot of CPU...
> Are they still delivering traffic?
they don't seem to handle any traffic (at least I can't see it through strace)
but that's the main difference here, using lots of CPU.
> > stra
On Tue, Jun 12, 2018 at 04:00:25PM +0200, William Dauchy wrote:
> Hello William L,
>
> On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote:
> > That's great news!
> >
> > Here are the new patches. It shouldn't change anything about the fix; it
> > only changes sigprocmask to pthread_sigmask.
Hello William L,
On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote:
> That's great news!
>
> Here are the new patches. It shouldn't change anything about the fix; it
> only changes sigprocmask to pthread_sigmask.
In fact, I now have a different but similar issue.
root 18547 3
On Fri, Jun 08, 2018 at 06:22:39PM +0200, William Dauchy wrote:
> On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote:
> > That's great news!
> >
> > Here are the new patches. It shouldn't change anything about the fix; it
> > only changes sigprocmask to pthread_sigmask.
>
> thanks, I
On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote:
> That's great news!
>
> Here are the new patches. It shouldn't change anything about the fix; it
> only changes sigprocmask to pthread_sigmask.
Thanks, I attached the backport for 1.8 and started a new test with them.
Feel free to
On Fri, Jun 08, 2018 at 06:20:21PM +0200, Willy Tarreau wrote:
> On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote:
> > That's great news!
> >
> > Here are the new patches. It shouldn't change anything about the fix; it
> > only changes sigprocmask to pthread_sigmask.
>
> OK, I can merge them right now if you want.
On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote:
> That's great news!
>
> Here are the new patches. It shouldn't change anything about the fix; it
> only changes sigprocmask to pthread_sigmask.
OK, I can merge them right now if you want. At the very least it will
kill a whole cl
On Fri, Jun 08, 2018 at 02:10:44PM +0200, William Dauchy wrote:
> On Thu, Jun 07, 2018 at 11:50:45AM +0200, William Lallemand wrote:
> > Sorry for the late reply, I managed to reproduce and fix what seems to be
> > the bug.
> > The signal management was not handled correctly with threads.
> > Could you try those patches and see if it fixes the problem?
Hello William L.,
On Thu, Jun 07, 2018 at 11:50:45AM +0200, William Lallemand wrote:
> Sorry for the late reply, I managed to reproduce and fix what seems to be the
> bug.
> The signal management was not handled correctly with threads.
> Could you try those patches and see if it fixes the problem?
On Thu, Jun 07, 2018 at 12:02:46PM +0200, Willy Tarreau wrote:
> On Thu, Jun 07, 2018 at 11:50:45AM +0200, William Lallemand wrote:
> > 	/* block signal delivery during processing */
> > +#ifdef USE_THREAD
> > +	pthread_sigmask(SIG_SETMASK, &blocked_sig, &old_sig);
> > +#else
> > 	sigprocma
On Thu, Jun 07, 2018 at 11:50:45AM +0200, William Lallemand wrote:
> 	/* block signal delivery during processing */
> +#ifdef USE_THREAD
> +	pthread_sigmask(SIG_SETMASK, &blocked_sig, &old_sig);
> +#else
> 	sigprocmask(SIG_SETMASK, &blocked_sig, &old_sig);
> +#endif
I think for the
Hi guys,
Sorry for the late reply, I managed to reproduce and fix what seems to be the
bug.
The signal management was not handled correctly with threads.
Could you try those patches and see if it fixes the problem?
Thanks.
--
William Lallemand
From d695242fb260538bd8db323715d627c4a9deacc7 Mon
On Wed, May 30, 2018 at 07:57:03PM +0200, Tim Düsterhus wrote:
> William,
>
On 30.05.2018 at 19:45, William Lallemand wrote:
> >> @William Lallemand Possibly the sd_notifyf should be moved below
> >> mworker_unblock_signals in mworker_wait?
> >>
> >
> > This shouldn't happen with or without systemd.
On Wed, May 30, 2018 at 5:29 PM, William Lallemand wrote:
> I can reproduce the same situation there; however, I disabled the seamless
> reload. When doing a -USR1 & strace on a remaining worker, I can see that
> the signal is not blocked, and that it's still polling
good news!
> Unfortunate
William,
On 30.05.2018 at 19:45, William Lallemand wrote:
>> @William Lallemand Possibly the sd_notifyf should be moved below
>> mworker_unblock_signals in mworker_wait?
>>
>
> This shouldn't happen with or without systemd. I can reproduce it without
> using systemd; we should not rely on an e
Hi Tim,
On Tue, May 29, 2018 at 09:33:48PM +0200, Tim Düsterhus wrote:
>
> @William Lallemand Possibly the sd_notifyf should be moved below
> mworker_unblock_signals in mworker_wait?
>
This shouldn't happen with or without systemd. I can reproduce it without
using systemd; we should not rely
On Wed, May 30, 2018 at 04:47:31PM +0200, William Dauchy wrote:
> Hello William L.,
>
Hi William D. :-)
> I did some more testing:
> I simplified my config, removing the multi binding part and cpu-map.
> The conclusion is: I have this issue when I activate the nbthread feature
> (meaning no problem without).
Hello William L.,
I did some more testing:
I simplified my config, removing the multi binding part and cpu-map.
The conclusion is: I have this issue when I activate the nbthread feature
(meaning no problem without).
I tried to kill -USR1 the failing worker, but it remains.
Here are the Sig* from status
On Tue, May 29, 2018 at 08:35:19PM +0200, William Dauchy wrote:
> However, I don't see where haproxy would need to do a DNS lookup on our
> side: the frontend side is host matching and the backend side is IP only.
We studied the possibility that a reload happens at the exact moment
the config fin
Hello Tim,
On Tue, May 29, 2018 at 9:33 PM, Tim Düsterhus wrote:
> Run systemctl status haproxy. It shows the status:
>
>> [timwolla@/s/haproxy (maxrewrite-warn)]sudo systemctl status haproxy
>> ● haproxy.service - HAProxy Load Balancer
>>    Loaded: loaded (/lib/systemd/system/haproxy.service; d
William,
On 29.05.2018 at 17:09, William Dauchy wrote:
> We are using Type=notify. I however cannot guarantee we do not trigger
> a new reload, before the previous one is done. Is there a way to check
> the "ready" state you mentioned?
Run systemctl status haproxy. It shows the status:
> [timwo
Hello Dave,
On Tue, May 29, 2018 at 5:55 PM, Dave Chiluk wrote:
> We've battled the same issue with our haproxys. We root-caused it to slow
> DNS lookup times while parsing the config: config parsing took so long that
> we were attempting to reload again before the original
We've battled the same issue with our haproxys. We root-caused it to slow
DNS lookup times while parsing the config: config parsing took so long that
we were attempting to reload again before the original reload had completed.
I'm still not sure why or where the signals
are
Hello William,
Sorry for the late answer.
> Are the problematical workers leaving when you reload a second time?
no, they seem to stay for a long time (forever?)
> Did you try to kill -USR1 the worker? It should exit with "Former worker
> $PID exited with code 0" on stderr.
> If not, coul
On Thu, May 24, 2018 at 11:00:29PM +0200, William Dauchy wrote:
> On Thu, May 24, 2018 at 12:01:38PM +0200, William Lallemand wrote:
> > I managed to reproduce something similar with the 1.8.8 version. It looks
> > like leaving a socat connected to the socket helps.
> >
> > I'm looking into th
Hi William,
Thank you for your reply.
On Thu, May 24, 2018 at 12:01:38PM +0200, William Lallemand wrote:
> I managed to reproduce something similar with the 1.8.8 version. It looks like
> leaving a socat connected to the socket helps.
>
> I'm looking into the code to see what's happening.
Indeed
On Thu, May 24, 2018 at 10:07:23AM +0200, William Dauchy wrote:
> On Wed, May 23, 2018 at 08:45:04PM +0200, William Dauchy wrote:
> > More details which could help understand what is going on:
> >
> > ps output:
> >
> > root 15928 0.3 0.0 255216 185268 ? Ss May21 10:11
> > /usr/sbin
On Wed, May 23, 2018 at 08:45:04PM +0200, William Dauchy wrote:
> More details which could help understand what is going on:
>
> ps output:
>
> root 15928 0.3 0.0 255216 185268 ? Ss May21 10:11
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf
> 16988 1691
On Wed, May 23, 2018 at 06:49:09PM +0200, William Dauchy wrote:
> We do frequent reloads (approximately every 10s).
> After a while, some processes remain alive and seem to never exit (waited >24
> hours). While stracing them, some of them are still handling traffic and
> doing healthchecks. Some
Hello,
I am trying to understand a possible issue we have regarding haproxy (seamless)
reloads.
I am using haproxy v1.8.9 with the following config (using nbthread):
global
    log 127.0.0.1 local0 info
    maxconn 262144
    user haproxy
    group haproxy
    nbproc 1
    daemon
    stats