On Fri, Aug 20, 2021 at 03:10:05PM +0200, Willy Tarreau wrote:
> BTW maybe we should arrange to take the Lua lock inside an externally
> visible function that could be resolved. It would more easily show up in
> case of trouble so that the issue becomes more obvious.
And it works pretty well!
- perf top
    86.61%  haproxy        [.] lua_take_global_lock
     9.69%  haproxy        [.] luaV_execute
     2.72%  haproxy        [.] luaG_traceexec
     0.17%  [kernel]       [k] native_io_delay
     0.07%  [i2c_i801]     [k] 0x0000000000000cf2
     0.03%  [kernel]       [k] acpi_os_read_port
     0.02%  perf           [.] rb_next
     0.02%  libc-2.30.so   [.] __strcmp_avx2
     0.01%  perf           [.] __symbols__insert
     0.01%  perf           [.] dso__find_symbol
     0.01%  [kernel]       [k] kallsyms_expand_symbol.constprop.0
     0.01%  libc-2.30.so   [.] _int_malloc
- "show threads"
  >Thread 4 : id=0x7f561b677700 act=1 glob=0 wq=1 rq=0 tl=1 tlsz=0 rqsz=1
       stuck=1 prof=1 harmless=0 wantrdv=0
       cpu_ns: poll=43427720070 now=43835242784 diff=407522714
       curr_task=0x7f560c02e9c0 (task) calls=1 last=408413622 ns ago
         fct=0x4d3b20(process_stream) ctx=(nil)
       call trace(13):
       | 0x5a0c3b [85 c0 75 19 48 81 c4 c8]: ha_dump_backtrace+0x2b/0x299
       | 0x5a209f [48 83 c4 38 5b 5d 41 5c]: ha_thread_dump+0x22f/0x281
       | 0x5a2166 [48 8b 05 2b 2c 20 00 48]: debug_handler+0x66/0x10b
       | 0x7f5621ef6690 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x13690
       | 0x46341a [eb cc 0f 1f 40 00 48 b8]: lua_take_global_lock+0x4a/0x4c
       | 0x46987d [64 48 8b 04 25 00 00 00]: hlua_ctx_destroy+0xcd/0x35a
       | 0x4d77ea [49 83 bf f0 00 00 00 00]: process_stream+0x3cca/0x4e18
- panic:
  >Thread 8 : id=0x7f8ec57fa700 act=1 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0
       stuck=1 prof=0 harmless=0 wantrdv=0
       cpu_ns: poll=611690 now=1994658802 diff=1994047112
       curr_task=0x7f8eb402e9c0 (task) calls=1 last=0
         fct=0x4d3b20(process_stream) ctx=0x7f8eb402e550
       strm=0x7f8eb402e550 src=127.0.0.1 fe=fe4 be=fe4 dst=unknown
       rqf=40d08002 rqa=30 rpf=80000000 rpa=0 sif=EST,200020 sib=INI,30
       af=(nil),0 csf=0x7f8eb402e500,4000
       ab=(nil),0 csb=(nil),0
       cof=0x7f8eb802a2c0,80001300:H1(0x7f8eb4026240)/RAW((nil))/tcpv4(40)
       cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0)
       Current executing Lua from a stream analyser --
       call trace(20):
       | 0x5a0c3b [85 c0 75 19 48 81 c4 c8]: ha_dump_backtrace+0x2b/0x299
       | 0x5a209f [48 83 c4 38 5b 5d 41 5c]: ha_thread_dump+0x22f/0x281
       | 0x5a2166 [48 8b 05 2b 2c 20 00 48]: debug_handler+0x66/0x10b
       | 0x7f8ece7d3690 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x13690
       | 0x46341a [eb cc 0f 1f 40 00 48 b8]: lua_take_global_lock+0x4a/0x4c
       | 0x466c17 [e9 fe fe ff ff 0f 1f 40]: hlua_ctx_init+0x147/0x1cd
       | 0x469592 [85 c0 74 3a 48 8b 44 24]: main+0x3dc92
       | 0x51dacb [85 c0 74 61 48 8b 5d 20]: sample_process+0x4b/0xf7
       | 0x51e55c [48 85 c0 74 3f 64 48 63]: sample_fetch_as_type+0x3c/0x9b
       | 0x525613 [48 89 c6 48 85 c0 0f 84]: sess_build_logline+0x2443/0x3cae
       | 0x4af0be [4c 63 e8 4c 03 6d 10 4c]: http_apply_redirect_rule+0xbfe/0xdf8
       | 0x4af523 [83 f8 01 19 c0 83 e0 03]: main+0x83c23
       | 0x4b2326 [83 f8 07 0f 87 99 00 00]: http_process_req_common+0xf6/0x15f6
       | 0x4d5b30 [85 c0 0f 85 9f f5 ff ff]: process_stream+0x2010/0x4e18
And from what I'm seeing, the forced-yield setting doesn't help against this,
because the system really makes no visible progress and non-Lua work is
blocked as well (the CLI is unresponsive even during short loops).
Thus I'm going to commit this. It doesn't show any measurable performance
change, considering how long that lock is usually held!
For reference, the Lua code here does this:
  core.register_fetches("loop", function(txn)
      for i=1,100000000 do end
      return "/"
  end)
And the haproxy conf is this:
  global
      stats socket /tmp/sock1 level admin
      # tune.lua.forced-yield 10
      # lua-load-per-thread loop.lua
      lua-load loop.lua

  defaults
      mode http
      timeout client 10s
      timeout server 10s
      timeout connect 10s

  frontend fe4
      bind :9003
      http-request redirect location %[lua.loop]
Willy