Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-15 Thread Robin H. Johnson
On Thu, Apr 15, 2021 at 07:53:15PM +, Robin H. Johnson wrote: > But your thought of CPU pinning was good. > I went to confirm it in the host, and I'm not certain if the cpu-map is > working > right. Ignore me, long day and I didn't think to check each thread PID: # ps -e -T | grep haproxy -w

Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-15 Thread Robin H. Johnson
On Thu, Apr 15, 2021 at 09:23:07AM +0200, Willy Tarreau wrote: > On Thu, Apr 15, 2021 at 07:13:53AM +, Robin H. Johnson wrote: > > Thanks; I will need to catch it faster or automate this, because the > > watchdog does a MUCH better job restarting it than before, less than 30 > > seconds of

Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-15 Thread Willy Tarreau
On Thu, Apr 15, 2021 at 07:13:53AM +, Robin H. Johnson wrote: > Thanks; I will need to catch it faster or automate this, because the > watchdog does a MUCH better job restarting it than before, less than 30 > seconds of 100% CPU before the watchdog reliably kills it. I see. Then collecting

Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-15 Thread Robin H. Johnson
On Thu, Apr 15, 2021 at 08:59:35AM +0200, Willy Tarreau wrote: > On Wed, Apr 14, 2021 at 01:53:06PM +0200, Christopher Faulet wrote: > > > nbthread=64, nbproc=1 on both 1.8/2.x > > > > It is thus surprising, if it is really a contention issue, that you never > > observed slow down on the 1.8.

Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-15 Thread Willy Tarreau
On Wed, Apr 14, 2021 at 01:53:06PM +0200, Christopher Faulet wrote: > > nbthread=64, nbproc=1 on both 1.8/2.x > > It is thus surprising, if it is really a contention issue, that you never > observed slow down on the 1.8. There is no watchdog, but the thread > implementation is a bit awkward on

Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-14 Thread Christopher Faulet
On 10/04/2021 at 00:34, Robin H. Johnson wrote: On Fri, Apr 09, 2021 at 10:14:26PM +0200, Christopher Faulet wrote: It seems you have a blocking call in one of your Lua scripts. The thread dump shows many threads blocked in hlua_ctx_init. Many others are executing Lua. Unfortunately, for an

Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-09 Thread Robin H. Johnson
On Fri, Apr 09, 2021 at 10:14:26PM +0200, Christopher Faulet wrote: > It seems you have a blocking call in one of your Lua scripts. The thread dump > shows many threads blocked in hlua_ctx_init. Many others are executing Lua. > Unfortunately, for an unknown reason, there is no stack traceback.

Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-09 Thread Christopher Faulet
On 09/04/2021 at 19:26, Robin H. Johnson wrote: Hi, Maciej had said they were going to create a new thread, but I didn't see one yet. I want to start by noting problem was much worse on 2.2.8 & 2.2.9, and that 2.2.13 & 2.3.9 don't get entirely hung at 100% anymore: a big thanks for that

Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-09 Thread Maciej Zdeb
Hi Robin, On Fri, 9.04.2021 at 19:26, Robin H. Johnson wrote: > Maciej had said they were going to create a new thread, but I didn't see > one yet. > > I want to start by noting problem was much worse on 2.2.8 & 2.2.9, and > that 2.2.13 & 2.3.9 don't get entirely hung at 100% anymore: a

Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-09 Thread Robin H. Johnson
Hi, Maciej had said they were going to create a new thread, but I didn't see one yet. I want to start by noting problem was much worse on 2.2.8 & 2.2.9, and that 2.2.13 & 2.3.9 don't get entirely hung at 100% anymore: a big thanks for that initial work in fixing the issue. As I mentioned in my

Re: [2.2.9] 100% CPU usage

2021-04-02 Thread Maciej Zdeb
Hi Christopher, Yes I know, my issues are always pretty weird. ;) Of course it's not reproducible. :( I'll try to collect more data and return to you. I will start a new thread to not mix those two cases. Kind regards, Fri, 2 Apr 2021 at 10:13 Christopher Faulet wrote: > On 31/03/2021 at

Re: [2.2.9] 100% CPU usage

2021-04-02 Thread Christopher Faulet
On 31/03/2021 at 13:28, Maciej Zdeb wrote: Hi, Well it's a bit better situation than earlier because only one thread is looping forever and the rest is working properly. I've tried to verify where exactly the thread looped but doing "n" in gdb fixed the problem :( After quitting gdb

Re: [2.2.9] 100% CPU usage

2021-03-31 Thread Maciej Zdeb
I forgot to mention that the backtrace is from 2.2.11 built from http://git.haproxy.org/?p=haproxy-2.2.git;a=commit;h=601704962bc9d82b3b1cc97d90d2763db0ae4479 Wed, 31 Mar 2021 at 13:28 Maciej Zdeb wrote: > Hi, > > Well it's a bit better situation than earlier because only one thread is >

Re: [2.2.9] 100% CPU usage

2021-03-31 Thread Maciej Zdeb
Hi, Well it's a bit better situation than earlier because only one thread is looping forever and the rest is working properly. I've tried to verify where exactly the thread looped but doing "n" in gdb fixed the problem :( After quitting gdb session all threads were idle. Before I started gdb it

Re: [2.2.9] 100% CPU usage

2021-03-25 Thread Christopher Faulet
On 25/03/2021 at 13:38, Maciej Zdeb wrote: Hi, I deployed a patched (with volatile hlua_not_dumpable) HAProxy and so far so good, no looping. Christopher I saw new patches with hlua_traceback used instead, looks much cleaner to me, should I verify them instead? :) Christopher & Willy I've

Re: [2.2.9] 100% CPU usage

2021-03-25 Thread Maciej Zdeb
Hi, I deployed a patched (with volatile hlua_not_dumpable) HAProxy and so far so good, no looping. Christopher I saw new patches with hlua_traceback used instead, looks much cleaner to me, should I verify them instead? :) Christopher & Willy I've forgotten to thank you for help! Kind regards,

Re: [2.2.9] 100% CPU usage

2021-03-24 Thread Maciej Zdeb
Wed, 24 Mar 2021 at 10:37 Christopher Faulet wrote: > However, reading the other trace Maciej sent (bussy_thread_peers.txt), it > seems > possible to stop a memory allocation from other places. Thus, I guess we > must > find a more global way to prevent the lua stack dump. > I'm not sure

Re: [2.2.9] 100% CPU usage

2021-03-24 Thread Christopher Faulet
On 24/03/2021 at 10:16, Willy Tarreau wrote: On Wed, Mar 24, 2021 at 10:11:19AM +0100, Maciej Zdeb wrote: Wow, that's it! :) 0x55d94949e965 <+53>: addl $0x1,%fs:0xfffdd688 0x55d94949e96e <+62>: callq 0x55d9494782c0 0x55d94949e973 <+67>: subl

Re: [2.2.9] 100% CPU usage

2021-03-24 Thread Willy Tarreau
On Wed, Mar 24, 2021 at 10:11:19AM +0100, Maciej Zdeb wrote: > Wow, that's it! :) > >0x55d94949e965 <+53>: addl $0x1,%fs:0xfffdd688 >0x55d94949e96e <+62>: callq 0x55d9494782c0 >0x55d94949e973 <+67>: subl $0x1,%fs:0xfffdd688 > [...] >

Re: [2.2.9] 100% CPU usage

2021-03-24 Thread Maciej Zdeb
Wow, that's it! :) 0x55d94949e965 <+53>: addl $0x1,%fs:0xfffdd688 0x55d94949e96e <+62>: callq 0x55d9494782c0 0x55d94949e973 <+67>: subl $0x1,%fs:0xfffdd688 [...] 0x55d94949e99f <+111>: ja 0x55d94949e9b0 0x55d94949e9a1 <+113>: mov

Re: [2.2.9] 100% CPU usage

2021-03-24 Thread Willy Tarreau
On Wed, Mar 24, 2021 at 09:52:22AM +0100, Willy Tarreau wrote: > So yes, it looks like gcc decides that a function called "malloc" will > never use your program's global variables but that "blablalloc" may. I > have no explanation to this except "optimization craziness" resulting > in breaking
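The pattern discussed in these messages (a per-thread counter bumped around an allocator call, as in the addl/callq/subl sequence quoted in the thread) can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not HAProxy's actual code: the names not_dumpable, compiler_barrier, probe_alloc and guarded_alloc are hypothetical, and the sketch simply combines the two remedies mentioned in the thread (a volatile counter and compiler barriers) using GCC/Clang inline-asm syntax.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-thread counter, loosely modeled on the
 * hlua_not_dumpable variable named in the thread. "volatile" forces the
 * compiler to perform every store instead of cancelling the ++/-- pair. */
_Thread_local volatile unsigned int not_dumpable;

/* Compiler-only barrier (GCC/Clang syntax, an assumption of this sketch):
 * emits no instruction, but forbids moving memory accesses across it. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Records what the counter looked like from inside the allocation path. */
unsigned int seen_during_alloc;

/* Stand-in for the real allocator call; it only observes the counter. */
void *probe_alloc(size_t size)
{
    seen_during_alloc = not_dumpable;
    return malloc(size);
}

/* Marks the thread as "not dumpable" for the duration of the allocation,
 * so that an asynchronous observer (e.g. a signal handler) can tell. */
void *guarded_alloc(size_t size)
{
    void *ptr;

    not_dumpable++;
    compiler_barrier();
    ptr = probe_alloc(size);
    compiler_barrier();
    not_dumpable--;
    return ptr;
}
```

Calling guarded_alloc(32) should leave not_dumpable back at 0 afterwards, while the allocation path itself sees it raised. Either remedy alone may be enough in practice; both are shown here only to make the intent explicit.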

Re: [2.2.9] 100% CPU usage

2021-03-24 Thread Willy Tarreau
On Wed, Mar 24, 2021 at 09:41:03AM +0100, Willy Tarreau wrote: > This is particularly strange. Could you please disassemble hlua_alloc ? > (dis hlua_alloc) ? > > You should find something like this: > >0x004476c3 <+147>: add DWORD PTR fs:0xfffdd678,0x1 >

Re: [2.2.9] 100% CPU usage

2021-03-24 Thread Willy Tarreau
On Wed, Mar 24, 2021 at 08:55:33AM +0100, Maciej Zdeb wrote: > After reading I wasn't sure anymore I even tested properly patched package. > :) Hehe, I know that this happens quite a lot when starting to play with different binaries. > Fortunately I have a core file so I verified if

Re: [2.2.9] 100% CPU usage

2021-03-24 Thread Maciej Zdeb
Hi, Tue, 23 Mar 2021 at 18:36 Willy Tarreau wrote: > > It is most probably because of compiler optimizations. Some compiler > > barriers are necessary to avoid instructions reordering. It is the > purpose > > of attached patches. Sorry to ask you again, but could you make some > > tests

Re: [2.2.9] 100% CPU usage

2021-03-23 Thread Willy Tarreau
On Tue, Mar 23, 2021 at 04:12:41PM +0100, Christopher Faulet wrote: > On 23/03/2021 at 11:14, Maciej Zdeb wrote: > > Hi Christopher, > > > > Bad news, patches didn't help. Attaching stacktraces, now it looks that > > spoe that executes lua scripts (free_thread_spue_lua.txt) tried to > > malloc

Re: [2.2.9] 100% CPU usage

2021-03-23 Thread Christopher Faulet
On 23/03/2021 at 11:14, Maciej Zdeb wrote: Hi Christopher, Bad news, patches didn't help. Attaching stacktraces, now it looks that spoe that executes lua scripts (free_thread_spue_lua.txt) tried to malloc twice. :( It is most probably because of compiler optimizations. Some compiler

Re: [2.2.9] 100% CPU usage

2021-03-23 Thread Maciej Zdeb
Hi Christopher, Bad news, patches didn't help. Attaching stacktraces, now it looks that spoe that executes lua scripts (free_thread_spue_lua.txt) tried to malloc twice. :( Kind regards, Mon, 22 Mar 2021 at 08:39 Maciej Zdeb wrote: > Hi Christopher, > > Thanks! I'm building a patched

Re: [2.2.9] 100% CPU usage

2021-03-22 Thread Maciej Zdeb
Hi Christopher, Thanks! I'm building a patched version and will return with feedback! Kind regards, Fri, 19 Mar 2021 at 16:40 Christopher Faulet wrote: > On 16/03/2021 at 13:46, Maciej Zdeb wrote: > > Sorry for spam. In the last message I said that the old process (after > reload) > >

Re: [2.2.9] 100% CPU usage

2021-03-19 Thread Christopher Faulet
On 16/03/2021 at 13:46, Maciej Zdeb wrote: Sorry for spam. In the last message I said that the old process (after reload) is consuming cpu for lua processing and that's not true, it is processing other things also. I'll take a break. ;) Then I'll verify if the issue exists on 2.3 and maybe

Re: [2.2.9] 100% CPU usage

2021-03-17 Thread Maciej Zdeb
Hi Christopher, That's good news! If you need me to test a patch then let me know. On my side I'm preparing to update HAProxy to 2.3 and solving some simple issues like missing newlines at the end of the configuration. ;) Kind regards, Wed, 17 Mar 2021 at 10:49 Christopher Faulet wrote: >

Re: [2.2.9] 100% CPU usage

2021-03-17 Thread Christopher Faulet
On 16/03/2021 at 13:46, Maciej Zdeb wrote: Sorry for spam. In the last message I said that the old process (after reload) is consuming cpu for lua processing and that's not true, it is processing other things also. I'll take a break. ;) Then I'll verify if the issue exists on 2.3 and maybe

Re: [2.2.9] 100% CPU usage

2021-03-16 Thread Willy Tarreau
On Tue, Mar 16, 2021 at 01:46:48PM +0100, Maciej Zdeb wrote: > In the last message I said that the old process (after > reload) is consuming cpu for lua processing and that's not true, it is > processing other things also. > > I'll take a break. ;) Then I'll verify if the issue exists on 2.3 and

Re: [2.2.9] 100% CPU usage

2021-03-16 Thread Maciej Zdeb
Sorry for spam. In the last message I said that the old process (after reload) is consuming cpu for lua processing and that's not true, it is processing other things also. I'll take a break. ;) Then I'll verify if the issue exists on 2.3 and maybe 2.4 branch. For each version I need a week or two

Re: [2.2.9] 100% CPU usage

2021-03-16 Thread Maciej Zdeb
Below is the output from perf top - it happens during reload (all threads on an old process spike and use 100% cpu, processing lua) and after 15-30 seconds old process exits. It is probably a different bug, because it happens on an old process and I have no idea how it could affect the new

Re: [2.2.9] 100% CPU usage

2021-03-16 Thread Maciej Zdeb
Sure, patch from Christopher attached. :) Tue, 16 Mar 2021 at 10:58 Willy Tarreau wrote: > Hi Maciej, > > On Tue, Mar 16, 2021 at 10:55:11AM +0100, Maciej Zdeb wrote: > > Hi, > > > > I'm returning with bad news, the patch did not help and the issue > occurred > > today (on patched 2.2.10).

Re: [2.2.9] 100% CPU usage

2021-03-16 Thread Willy Tarreau
Hi Maciej, On Tue, Mar 16, 2021 at 10:55:11AM +0100, Maciej Zdeb wrote: > Hi, > > I'm returning with bad news, the patch did not help and the issue occurred > today (on patched 2.2.10). It is definitely related to reloads, however it > is a very rare issue; it worked flawlessly the whole week. OK.

Re: [2.2.9] 100% CPU usage

2021-03-16 Thread Maciej Zdeb
Hi, I'm returning with bad news, the patch did not help and the issue occurred today (on patched 2.2.10). It is definitely related to reloads, however it is a very rare issue; it worked flawlessly the whole week. Tue, 9 Mar 2021 at 09:17 Willy Tarreau wrote: > On Tue, Mar 09, 2021 at

Re: [2.2.9] 100% CPU usage

2021-03-09 Thread Willy Tarreau
On Tue, Mar 09, 2021 at 09:04:43AM +0100, Maciej Zdeb wrote: > Hi, > > After applying the patch, the issue did not occur, however I'm still not > sure it is fixed. Unfortunately I don't have a reliable way to trigger it. OK. If it's related, it's very possible that some of the issues we've

Re: [2.2.9] 100% CPU usage

2021-03-09 Thread Maciej Zdeb
Hi, After applying the patch, the issue did not occur, however I'm still not sure it is fixed. Unfortunately I don't have a reliable way to trigger it. Fri, 5 Mar 2021 at 22:07 Willy Tarreau wrote: > Note, before 2.4, a single thread can execute Lua scripts at once, > with the others

Re: [2.2.9] 100% CPU usage

2021-03-05 Thread Willy Tarreau
On Fri, Mar 05, 2021 at 12:00:52PM +0100, Christopher Faulet wrote: > On 05/03/2021 at 11:35, Maciej Zdeb wrote: > > Hi Christopher, > > > > Thanks, I'll check but it'll take a couple days because the issue is > > quite rare. I'll return with feedback! > > > > Maybe the patch is not backported

Re: [2.2.9] 100% CPU usage

2021-03-05 Thread Christopher Faulet
On 05/03/2021 at 11:35, Maciej Zdeb wrote: Hi Christopher, Thanks, I'll check but it'll take a couple days because the issue is quite rare. I'll return with feedback! Maybe the patch is not backported to 2.2 because of commit message that states only 2.3 branch? That's it. And it was

Re: [2.2.9] 100% CPU usage

2021-03-05 Thread Maciej Zdeb
Hi Christopher, Thanks, I'll check but it'll take a couple days because the issue is quite rare. I'll return with feedback! Maybe the patch is not backported to 2.2 because of commit message that states only 2.3 branch? Kind regards, Thu, 4 Mar 2021 at 22:34 Christopher Faulet wrote: >

Re: [2.2.9] 100% CPU usage

2021-03-04 Thread Christopher Faulet
On 04/03/2021 at 14:01, Maciej Zdeb wrote: Hi, Sometimes after HAProxy reload it starts to loop infinitely, for example 9 of 10 threads using 100% CPU (gdb sessions attached). I've also dumped the core file from gdb. Hi Maciej, The 2.2.10 is out. But I'm afraid that a fix is missing.