Almost identical to 13e86d9 (BUG/MEDIUM: listener: Fix race condition when
updating the global mngmt task).

When multiple `listener_accept` threads try to schedule the proxy's
management task (whose handler is `manage_proxy`), one listener thread may
be delayed long enough to create a race between its call to `task_schedule`
and `manage_proxy`'s clearing of the task's `expire` field. This only
happens when `rate-limit sessions` is set and the machine is loaded enough
for thread scheduling to delay the listener thread in this way.
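To make the interleaving concrete, here is a minimal standalone C sketch of
the race and of the locking scheme used by the fix below. It is an
illustration only, not HAProxy code: the `toy_*` names are hypothetical, the
`assert()` plays the role of the `BUG_ON()` in `__task_queue()`, and a plain
`pthread_rwlock_t` stands in for the proxy's RW lock (read lock around
scheduling, write lock around the `expire` update).

```c
/* Minimal model of the race; toy_* names are hypothetical stand-ins. */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

struct toy_task {
        int expire;              /* 0 plays the role of "ETERNITY" */
        pthread_rwlock_t lock;   /* stands in for the proxy's RW lock */
};

/* Queueing requires a non-zero expiry, like the BUG_ON() in __task_queue(). */
static void toy_task_queue(struct toy_task *t)
{
        assert(t->expire != 0);
}

/* Listener side: set the expiry and queue the task under the read lock. */
static void *toy_listener(void *arg)
{
        struct toy_task *t = arg;

        pthread_rwlock_rdlock(&t->lock);
        t->expire = 42;          /* a tick computed earlier by the listener */
        toy_task_queue(t);
        pthread_rwlock_unlock(&t->lock);
        return NULL;
}

/* Manager side: reset the expiry to "ETERNITY" under the write lock. */
static void *toy_manager(void *arg)
{
        struct toy_task *t = arg;

        pthread_rwlock_wrlock(&t->lock);
        t->expire = 0;
        pthread_rwlock_unlock(&t->lock);
        return NULL;
}

int main(void)
{
        struct toy_task t = { .expire = 0 };
        pthread_t a, b;

        pthread_rwlock_init(&t.lock, NULL);
        pthread_create(&a, NULL, toy_listener, &t);
        pthread_create(&b, NULL, toy_manager, &t);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        puts("schedule and expire reset were serialized");
        return 0;
}
```

Without the two lock/unlock pairs, the manager's store of 0 can land between
the listener's update of `expire` and the queueing check, which is exactly
the window the patch closes.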
This results in the following `BUG_ON` in the listener task:

  FATAL: bug condition "task->expire == 0" matched at src/task.c:285
  call trace(11):
  | 0x56eb28 [0f 0b 66 0f 1f 44 00 00]: __task_queue+0xc8/0xca
  | 0x553903 [eb 9c 0f 1f 00 ba 03 00]: main+0x1317e3
  | 0x556de6 [8b 54 24 18 85 d2 0f 84]: listener_accept+0x12c6/0x14f6
  | 0x5ac348 [4d 8b 3c 24 49 01 df 64]: fd_update_events+0x1a8/0x4a8
  | 0x427d58 [48 39 eb 74 3b 66 66 66]: main+0x5c38
  | 0x53801f [64 48 8b 04 25 00 00 00]: run_poll_loop+0x10f/0x647
  | 0x5387f5 [49 c7 c4 c0 47 6a 00 49]: main+0x1166d5
  | 0x7f7f662b4333 [e9 74 fe ff ff 48 8b 44]: libc:+0x8b333
  | 0x7f7f66336efc [48 89 c7 b8 3c 00 00 00]: libc:+0x10defc

To fix it, we wrap the call to `task_schedule` in `listener_accept` and the
update of `task->expire` in `manage_proxy` with the proxy's RW lock, using
the same semantics as in 13e86d9: the `task_schedule` call is read-locked,
and the assignment of `expire` (which may set it to ETERNITY) is
write-locked.

This fixes issue #2289. It must be backported to 2.4 onward, and should
apply cleanly since the relevant code has not changed since then.
---
 src/listener.c | 4 ++++
 src/proxy.c    | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/src/listener.c b/src/listener.c
index 86d0945da..594ed2c7f 100644
--- a/src/listener.c
+++ b/src/listener.c
@@ -1574,7 +1574,11 @@ void listener_accept(struct listener *l)
                 */
                limit_listener(l, &p->listener_queue);
                if (p->task && tick_isset(expire))
+               {
+                       HA_RWLOCK_RDLOCK(PROXY_LOCK, &p->lock);
                        task_schedule(p->task, expire);
+                       HA_RWLOCK_RDUNLOCK(PROXY_LOCK, &p->lock);
+               }
                goto end;
        }
 
diff --git a/src/proxy.c b/src/proxy.c
index 6451cbb8a..25a6e4d99 100644
--- a/src/proxy.c
+++ b/src/proxy.c
@@ -2062,7 +2062,9 @@ struct task *manage_proxy(struct task *t, void *context, unsigned int state)
        /* The proxy is not limited so we can re-enable any waiting listener */
        dequeue_proxy_listeners(p);
  out:
+       HA_RWLOCK_WRLOCK(PROXY_LOCK, &p->lock);
        t->expire = next;
+       HA_RWLOCK_WRUNLOCK(PROXY_LOCK, &p->lock);
        task_queue(t);
        return t;
 }
--
2.43.0