=== Resending this with the threading broken, so that other readers hopefully see it.
It was previously in the thread <cae9ejaee6zhgp+oit5ccyblxs2dmaxmxytq8c++qa-n4m8_...@mail.gmail.com>. ===

Hi,

This is a follow-up to the prior threads about 100% CPU usage in 2.2.x & 2.3.x, where I described heavy workloads that initially caused HAProxy to hit 100% CPU; after the watchdog detection was added, the same workloads got the process killed instead.

After months of searching, we stumbled at work onto a reproduction case, usable only internally, built on a tool we wrote that makes millions of requests. Turning it up to around 6K RPS, with lots of the headers being processed by our Lua code, triggered the issue on a single-socket EPYC 7702P system.

We also found a surprising mitigation: enabling multithreaded Lua with "lua-load-per-thread" made the problem go away entirely (and gave a modest 10% performance boost; we are mostly limited by backend servers, not by HAProxy or Lua).

The Lua script was described in the previous thread. It only does complex string parsing, with the results used for variables and to drive some applets. It doesn't perform any blocking operations, use sockets or files, or rely on globals. It got a few cleanups for multi-threaded usage (forcing more variables to be explicitly local), but has no other significant changes relevant to this discussion (there were some business-logic changes to the string handling used to compute stick table keys, but no real functional changes).

The full errors are attached, along with the decoded core dumps, with some details redacted per $work security team requirements. I reproduced the error twice and both occurrences are attached, 4 files in total.

I'll repeat the short form here for interest, from just one of the occurrences:

====
Thread 23 is about to kill the process.
...
*>Thread 23: id=0x7f78d4ff9700 act=1 glob=1 wq=1 rq=1 tl=1 tlsz=8 rqsz=30
             stuck=1 prof=0 harmless=0 wantrdv=0
             cpu_ns: poll=63597658036 now=66142046101 diff=2544388065
             curr_task=0x7f7888c9db60 (task) calls=1 last=0
               fct=0x55e4b662ca30(process_stream) ctx=0x7f78890147e0
             strm=0x7f78890147e0 src=REDACTED-CLIENT-IP fe=tls-ipv4 be=tls-ipv4 dst=unknown
             rqf=40d08002 rqa=30 rpf=80000000 rpa=0 sif=EST,200020 sib=INI,30
             af=(nil),0 csf=0x7f7888d02bb0,104000
             ab=(nil),0 csb=(nil),0
             cof=0x7f78b0685500,80003300:H1(0x7f78886e1fa0)/SSL(0x7f78886c3270)/tcpv4(3900)
             cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0)
             Current executing Lua from a stream analyser -- stack traceback:

             call trace(20):
             | 0x55e4b6759008 [eb c5 66 0f 1f 44 00 00]: wdt_handler+0x98/0x129
             | 0x7f78ef9f4980 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x12980
             | 0x55e4b65d650a [48 8b 05 2f b9 45 00 48]: main+0x346da
             | 0x55e4b667778d [85 c0 0f 84 8b 00 00 00]: sample_process+0x4d/0x127
             | 0x55e4b670ba99 [48 85 c0 0f 84 c1 00 00]: main+0x169c69
             | 0x55e4b660a421 [83 f8 07 4c 8b 0c 24 0f]: main+0x685f1
             | 0x55e4b660d436 [83 f8 07 48 8b 4c 24 20]: http_process_req_common+0xf6/0x1659
             | 0x55e4b662e508 [85 c0 0f 85 0a f4 ff ff]:
[NOTICE]   (35377) : haproxy version is 2.4.2-1ppa1~bionic
[NOTICE]   (35377) : path to executable is /usr/sbin/haproxy
[ALERT]    (35377) : Current worker #1 (35404) exited with code 134 (Aborted)
[ALERT]    (35377) : exit-on-failure: killing every processes with SIGTERM
[WARNING]  (35377) : All workers exited. Exiting... (134)
====

-- 
Robin Hugh Johnson
E-Mail     : robb...@orbis-terrarum.net
Home Page  : http://www.orbis-terrarum.net/?l=people.robbat2
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85

Attachments:
  syslog.35404.gz
  syslog.34872.gz
  core.haproxy.35404.HOSTNAME.1627083123.txt.gz
  core.haproxy.34872.HOSTNAME.1627082514.txt.gz