Hi,

Well, the situation is a bit better than before, because only one thread is looping forever and the rest are working properly. I tried to verify where exactly the thread was looping, but stepping with "n" in gdb fixed the problem :( After I quit the gdb session, all threads were idle. Before I started gdb, the thread had been looping for about 3 hours without serving any traffic, because I had put the process into maintenance as soon as I observed the abnormal CPU usage.
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f2cf0df6a47 in epoll_wait (epfd=3, events=0x55d7aaa04920, maxevents=200,
    timeout=timeout@entry=39) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30      ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) thread 11
[Switching to thread 11 (Thread 0x7f2c3c53d700 (LWP 20608))]
#0  trace (msg=..., cb=<optimized out>, a4=<optimized out>, a3=<optimized out>,
    a2=<optimized out>, a1=<optimized out>, func=<optimized out>, where=...,
    src=<optimized out>, mask=<optimized out>, level=<optimized out>)
    at include/haproxy/trace.h:149
149             if (unlikely(src->state != TRACE_STATE_STOPPED))
(gdb) bt
#0  trace (msg=..., cb=<optimized out>, a4=<optimized out>, a3=<optimized out>,
    a2=<optimized out>, a1=<optimized out>, func=<optimized out>, where=...,
    src=<optimized out>, mask=<optimized out>, level=<optimized out>)
    at include/haproxy/trace.h:149
#1  h2_resume_each_sending_h2s (h2c=h2c@entry=0x7f2c18dca740, head=head@entry=0x7f2c18dcabf8) at src/mux_h2.c:3255
#2  0x000055d7a426c8e2 in h2_process_mux (h2c=0x7f2c18dca740) at src/mux_h2.c:3329
#3  h2_send (h2c=h2c@entry=0x7f2c18dca740) at src/mux_h2.c:3479
#4  0x000055d7a42734bd in h2_process (h2c=h2c@entry=0x7f2c18dca740) at src/mux_h2.c:3624
#5  0x000055d7a4276678 in h2_io_cb (t=<optimized out>, ctx=0x7f2c18dca740, status=<optimized out>) at src/mux_h2.c:3583
#6  0x000055d7a4381f62 in run_tasks_from_lists (budgets=budgets@entry=0x7f2c3c51a35c) at src/task.c:454
#7  0x000055d7a438282d in process_runnable_tasks () at src/task.c:679
#8  0x000055d7a4339467 in run_poll_loop () at src/haproxy.c:2942
#9  0x000055d7a4339819 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3107
#10 0x00007f2cf1e606db in start_thread (arg=0x7f2c3c53d700) at pthread_create.c:463
#11 0x00007f2cf0df671f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) bt full
#0  trace (msg=..., cb=<optimized out>, a4=<optimized out>, a3=<optimized out>,
    a2=<optimized out>, a1=<optimized out>, func=<optimized out>, where=...,
    src=<optimized out>, mask=<optimized out>, level=<optimized out>)
    at include/haproxy/trace.h:149
No locals.
#1  h2_resume_each_sending_h2s (h2c=h2c@entry=0x7f2c18dca740, head=head@entry=0x7f2c18dcabf8) at src/mux_h2.c:3255
        h2s = <optimized out>
        h2s_back = <optimized out>
        __FUNCTION__ = "h2_resume_each_sending_h2s"
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
#2  0x000055d7a426c8e2 in h2_process_mux (h2c=0x7f2c18dca740) at src/mux_h2.c:3329
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
#3  h2_send (h2c=h2c@entry=0x7f2c18dca740) at src/mux_h2.c:3479
        flags = <optimized out>
        released = <optimized out>
        buf = <optimized out>
        conn = 0x7f2bf658b8d0
        done = 0
        sent = 0
        __FUNCTION__ = "h2_send"
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
#4  0x000055d7a42734bd in h2_process (h2c=h2c@entry=0x7f2c18dca740) at src/mux_h2.c:3624
        conn = 0x7f2bf658b8d0
        __FUNCTION__ = "h2_process"
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
#5  0x000055d7a4276678 in h2_io_cb (t=<optimized out>, ctx=0x7f2c18dca740, status=<optimized out>) at src/mux_h2.c:3583
        conn = 0x7f2bf658b8d0
        tl = <optimized out>
        conn_in_list = 0
        h2c = 0x7f2c18dca740
        ret = <optimized out>
        __FUNCTION__ = "h2_io_cb"
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
        __x = <optimized out>
        __l = <optimized out>
#6  0x000055d7a4381f62 in run_tasks_from_lists (budgets=budgets@entry=0x7f2c3c51a35c) at src/task.c:454
        process = <optimized out>
        tl_queues = <optimized out>
        t = 0x7f2c0d3fa1c0
        budget_mask = 7 '\a'
        done = <optimized out>
        queue = <optimized out>
        state = <optimized out>
        ctx = <optimized out>
        __ret = <optimized out>
        __n = <optimized out>
        __p = <optimized out>
#7  0x000055d7a438282d in process_runnable_tasks () at src/task.c:679
        tt = 0x55d7a47a6d00 <task_per_thread+1280>
        lrq = <optimized out>
        grq = <optimized out>
        t = <optimized out>
        max = {0, 0, 141}
        max_total = <optimized out>
        tmp_list = <optimized out>
        queue = 3
        max_processed = <optimized out>
#8  0x000055d7a4339467 in run_poll_loop () at src/haproxy.c:2942
        next = <optimized out>
        wake = <optimized out>
#9  0x000055d7a4339819 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3107
        ptaf = <optimized out>
        ptif = <optimized out>
        ptdf = <optimized out>
        ptff = <optimized out>
        init_left = 0
        init_mutex = pthread_mutex_t = {Type = Normal, Status = Not acquired, Robust = No, Shared = No, Protocol = None}
        init_cond = pthread_cond_t = {Threads known to still execute a wait function = 0, Clock ID = CLOCK_REALTIME, Shared = No}
#10 0x00007f2cf1e606db in start_thread (arg=0x7f2c3c53d700) at pthread_create.c:463
        pd = 0x7f2c3c53d700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139827967416064, 7402574823425717764, 139827967272192, 0, 10, 140729081389088, -7430022153605859836, -7430137459590154748}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#11 0x00007f2cf0df671f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
(gdb) n
h2_resume_each_sending_h2s (h2c=h2c@entry=0x7f2c18dca740, head=head@entry=0x7f2c18dcabf8) at src/mux_h2.c:3255
3255        TRACE_ENTER(H2_EV_H2C_SEND|H2_EV_H2S_WAKE, h2c->conn);
(gdb) 3257  list_for_each_entry_safe(h2s, h2s_back, head, list) {
(gdb) 3289  TRACE_LEAVE(H2_EV_H2C_SEND|H2_EV_H2S_WAKE, h2c->conn);
(gdb) 3290  }
(gdb) h2_process_mux (h2c=0x7f2c18dca740) at src/mux_h2.c:3330
3330        h2_resume_each_sending_h2s(h2c, &h2c->send_list);
(gdb) 3334  if (h2c->st0 == H2_CS_ERROR) {
(gdb) 3345  TRACE_LEAVE(H2_EV_H2C_WAKE, h2c->conn);
(gdb) h2_send (h2c=h2c@entry=0x7f2c18dca740) at src/mux_h2.c:3478
3478        while (((h2c->flags & (H2_CF_MUX_MFULL|H2_CF_MUX_MALLOC)) == 0) && !done)
(gdb) 3479  done = h2_process_mux(h2c);
(gdb) 3482  done = 1; // we won't go further without extra buffers
(gdb) 3484  if ((conn->flags & (CO_FL_SOCK_WR_SH|CO_FL_ERROR)) ||
(gdb) 3485  (h2c->st0 == H2_CS_ERROR2) || (h2c->flags & H2_CF_GOAWAY_FAILED))
(gdb) 3491  for (buf = br_head(h2c->mbuf); b_size(buf); buf = br_del_head(h2c->mbuf)) {
(gdb) 3488  if (h2c->flags & (H2_CF_MUX_MFULL | H2_CF_DEM_MBUSY | H2_CF_DEM_MROOM))
(gdb) 3491  for (buf = br_head(h2c->mbuf); b_size(buf); buf = br_del_head(h2c->mbuf)) {
(gdb) 3514  if (sent)
(gdb) 3472  while (!done) {
(gdb) 3518  if (conn->flags & CO_FL_SOCK_WR_SH) {
(gdb) 3525  if (!(h2c->flags & (H2_CF_MUX_MFULL | H2_CF_DEM_MROOM)) && h2c->st0 >= H2_CS_FRAME_H)
(gdb) 3526  h2_resume_each_sending_h2s(h2c, &h2c->send_list);
(gdb) 3529  if (!br_data(h2c->mbuf)) {
(gdb) 3530  TRACE_DEVEL("leaving with everything sent", H2_EV_H2C_SEND, h2c->conn);
(gdb) 3541  }
(gdb) h2_process (h2c=h2c@entry=0x7f2c18dca740) at src/mux_h2.c:3626
3626        if (unlikely(h2c->proxy->state == PR_STSTOPPED) && !(h2c->flags & H2_CF_IS_BACK)) {
(gdb) 3643  if (!(h2c->flags & H2_CF_WAIT_FOR_HS) &&
(gdb) 3644  (conn->flags & (CO_FL_EARLY_SSL_HS | CO_FL_WAIT_XPRT | CO_FL_EARLY_DATA)) == CO_FL_EARLY_DATA) {
(gdb) 3643  if (!(h2c->flags & H2_CF_WAIT_FOR_HS) &&
(gdb) 3659  if (conn->flags & CO_FL_ERROR || h2c_read0_pending(h2c) ||
(gdb) 3660  h2c->st0 == H2_CS_ERROR2 || h2c->flags & H2_CF_GOAWAY_FAILED ||
(gdb) 3659  if (conn->flags & CO_FL_ERROR || h2c_read0_pending(h2c) ||
(gdb) 3660  h2c->st0 == H2_CS_ERROR2 || h2c->flags & H2_CF_GOAWAY_FAILED ||
(gdb) 3661  (eb_is_empty(&h2c->streams_by_id) && h2c->last_sid >= 0 &&
(gdb) 3677  else if (h2c->st0 == H2_CS_ERROR) {
(gdb) 3684  if (!b_data(&h2c->dbuf))
(gdb) 3687  if ((conn->flags & CO_FL_SOCK_WR_SH) ||
(gdb) 3688  h2c->st0 == H2_CS_ERROR2 || (h2c->flags & H2_CF_GOAWAY_FAILED) ||
(gdb) 3687  if ((conn->flags & CO_FL_SOCK_WR_SH) ||
(gdb) 3688  h2c->st0 == H2_CS_ERROR2 || (h2c->flags & H2_CF_GOAWAY_FAILED) ||
(gdb) 3690  !br_data(h2c->mbuf) &&
(gdb) 3689  (h2c->st0 != H2_CS_ERROR &&
(gdb) 3690  !br_data(h2c->mbuf) &&
(gdb) 3691  (h2c->mws <= 0 || LIST_ISEMPTY(&h2c->fctl_list)) &&
(gdb) 3692  ((h2c->flags & H2_CF_MUX_BLOCK_ANY) || LIST_ISEMPTY(&h2c->send_list))))
(gdb) 3680  MT_LIST_DEL((struct mt_list *)&conn->list);
(gdb) 3693  h2_release_mbuf(h2c);
(gdb) 3695  if (h2c->task) {
(gdb) 3696  if (h2c_may_expire(h2c))
(gdb) 3697  h2c->task->expire = tick_add(now_ms, h2c->last_sid < 0 ? h2c->timeout : h2c->shut_timeout);
(gdb) 3700  task_queue(h2c->task);
(gdb) 3703  h2_send(h2c);
(gdb) 3704  TRACE_LEAVE(H2_EV_H2C_WAKE, conn);
(gdb) 3705  return 0;
(gdb) 3704  TRACE_LEAVE(H2_EV_H2C_WAKE, conn);
(gdb) 3706  }
(gdb) h2_io_cb (t=<optimized out>, ctx=0x7f2c18dca740, status=<optimized out>) at src/mux_h2.c:3590
3590        if (!ret && conn_in_list) {
(gdb) 3600  TRACE_LEAVE(H2_EV_H2C_WAKE);
(gdb) 3602  }
(gdb) run_tasks_from_lists (budgets=budgets@entry=0x7f2c3c51a35c) at src/task.c:456
456         sched->current = NULL;
(gdb) 455   done++;
(gdb) 456   sched->current = NULL;
(gdb) 457   __ha_barrier_store();
(gdb) 458   continue;
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 399   if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1))) {
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 424   if (LIST_ISEMPTY(&tl_queues[queue])) {
(gdb) 430   if (!budgets[queue]) {
(gdb) 436   budgets[queue]--;
(gdb) 442   ctx = t->context;
(gdb) 443   process = t->process;
(gdb) 436   budgets[queue]--;
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 437   t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *, list);
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 441   activity[tid].ctxsw++;
(gdb) 444   t->calls++;
(gdb) 445   sched->current = t;
(gdb) 447   _HA_ATOMIC_SUB(&tasks_run_queue, 1);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 450   LIST_DEL_INIT(&((struct tasklet *)t)->list);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 451   __ha_barrier_store();
(gdb) 452   state = _HA_ATOMIC_XCHG(&t->state, state);
(gdb) 454   process(t, ctx, state);
(gdb) 456   sched->current = NULL;
(gdb) 455   done++;
(gdb) 456   sched->current = NULL;
(gdb) 457   __ha_barrier_store();
(gdb) 458   continue;
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 399   if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1))) {
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 424   if (LIST_ISEMPTY(&tl_queues[queue])) {
(gdb) 430   if (!budgets[queue]) {
(gdb) 436   budgets[queue]--;
(gdb) 442   ctx = t->context;
(gdb) 443   process = t->process;
(gdb) 436   budgets[queue]--;
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 437   t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *, list);
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 441   activity[tid].ctxsw++;
(gdb) 444   t->calls++;
(gdb) 445   sched->current = t;
(gdb) 447   _HA_ATOMIC_SUB(&tasks_run_queue, 1);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 450   LIST_DEL_INIT(&((struct tasklet *)t)->list);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 451   __ha_barrier_store();
(gdb) 452   state = _HA_ATOMIC_XCHG(&t->state, state);
(gdb) 454   process(t, ctx, state);
(gdb) 456   sched->current = NULL;
(gdb) 455   done++;
(gdb) 456   sched->current = NULL;
(gdb) 457   __ha_barrier_store();
(gdb) 458   continue;
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 399   if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1))) {
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 424   if (LIST_ISEMPTY(&tl_queues[queue])) {
(gdb) 430   if (!budgets[queue]) {
(gdb) 436   budgets[queue]--;
(gdb) 442   ctx = t->context;
(gdb) 443   process = t->process;
(gdb) 436   budgets[queue]--;
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 437   t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *, list);
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 441   activity[tid].ctxsw++;
(gdb) 444   t->calls++;
(gdb) 445   sched->current = t;
(gdb) 447   _HA_ATOMIC_SUB(&tasks_run_queue, 1);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 450   LIST_DEL_INIT(&((struct tasklet *)t)->list);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 451   __ha_barrier_store();
(gdb) 452   state = _HA_ATOMIC_XCHG(&t->state, state);
(gdb) 454   process(t, ctx, state);
(gdb) 456   sched->current = NULL;
(gdb) 455   done++;
(gdb) 456   sched->current = NULL;
(gdb) 457   __ha_barrier_store();
(gdb) 458   continue;
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 399   if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1))) {
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 424   if (LIST_ISEMPTY(&tl_queues[queue])) {
(gdb) 430   if (!budgets[queue]) {
(gdb) 436   budgets[queue]--;
(gdb) 442   ctx = t->context;
(gdb) 443   process = t->process;
(gdb) 436   budgets[queue]--;
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 437   t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *, list);
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 441   activity[tid].ctxsw++;
(gdb) 444   t->calls++;
(gdb) 445   sched->current = t;
(gdb) 447   _HA_ATOMIC_SUB(&tasks_run_queue, 1);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 450   LIST_DEL_INIT(&((struct tasklet *)t)->list);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 451   __ha_barrier_store();
(gdb) 452   state = _HA_ATOMIC_XCHG(&t->state, state);
(gdb) 454   process(t, ctx, state);
(gdb) 456   sched->current = NULL;
(gdb) 455   done++;
(gdb) 456   sched->current = NULL;
(gdb) 457   __ha_barrier_store();
(gdb) 458   continue;
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 399   if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1))) {
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 424   if (LIST_ISEMPTY(&tl_queues[queue])) {
(gdb) 430   if (!budgets[queue]) {
(gdb) 436   budgets[queue]--;
(gdb) 442   ctx = t->context;
(gdb) 443   process = t->process;
(gdb) 436   budgets[queue]--;
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 437   t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *, list);
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 441   activity[tid].ctxsw++;
(gdb) 444   t->calls++;
(gdb) 445   sched->current = t;
(gdb) 447   _HA_ATOMIC_SUB(&tasks_run_queue, 1);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 450   LIST_DEL_INIT(&((struct tasklet *)t)->list);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 451   __ha_barrier_store();
(gdb) 452   state = _HA_ATOMIC_XCHG(&t->state, state);
(gdb) 454   process(t, ctx, state);
(gdb) 456   sched->current = NULL;
(gdb) 455   done++;
(gdb) 456   sched->current = NULL;
(gdb) 457   __ha_barrier_store();
(gdb) 458   continue;
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 399   if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1))) {
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 424   if (LIST_ISEMPTY(&tl_queues[queue])) {
(gdb) 430   if (!budgets[queue]) {
(gdb) 436   budgets[queue]--;
(gdb) 442   ctx = t->context;
(gdb) 443   process = t->process;
(gdb) 436   budgets[queue]--;
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 437   t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *, list);
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 441   activity[tid].ctxsw++;
(gdb) 444   t->calls++;
(gdb) 445   sched->current = t;
(gdb) 447   _HA_ATOMIC_SUB(&tasks_run_queue, 1);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 450   LIST_DEL_INIT(&((struct tasklet *)t)->list);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 451   __ha_barrier_store();
(gdb) 452   state = _HA_ATOMIC_XCHG(&t->state, state);
(gdb) 454   process(t, ctx, state);
(gdb) 456   sched->current = NULL;
(gdb) 455   done++;
(gdb) 456   sched->current = NULL;
(gdb) 457   __ha_barrier_store();
(gdb) 458   continue;
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 399   if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1))) {
(gdb) 398   if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
(gdb) 424   if (LIST_ISEMPTY(&tl_queues[queue])) {
(gdb) 430   if (!budgets[queue]) {
(gdb) 436   budgets[queue]--;
(gdb) 442   ctx = t->context;
(gdb) 443   process = t->process;
(gdb) 436   budgets[queue]--;
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 437   t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *, list);
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 438   state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
(gdb) 440   ti->flags &= ~TI_FL_STUCK; // this thread is still running
(gdb) 441   activity[tid].ctxsw++;
(gdb) 444   t->calls++;
(gdb) 445   sched->current = t;
(gdb) 447   _HA_ATOMIC_SUB(&tasks_run_queue, 1);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 450   LIST_DEL_INIT(&((struct tasklet *)t)->list);
(gdb) 449   if (TASK_IS_TASKLET(t)) {
(gdb) 451   __ha_barrier_store();
(gdb) 452   state = _HA_ATOMIC_XCHG(&t->state, state);
(gdb) 454   process(t, ctx, state);
(gdb) 456   sched->current = NULL;
(gdb) 455   done++;
(gdb) 456   sched->current = NULL;
(gdb) 457   __ha_barrier_store();
(gdb) q
A debugging session is active.

        Inferior 1 [process 20598] will be detached.

Quit anyway? (y or n) y
Detaching from program: /usr/sbin/haproxy, process 20598

On Thu, 25 Mar 2021 at 13:51, Christopher Faulet <cfau...@haproxy.com> wrote:
> On 25/03/2021 at 13:38, Maciej Zdeb wrote:
> > Hi,
> >
> > I deployed a patched (with volatile hlua_not_dumpable) HAProxy and so far so
> > good, no looping. Christopher, I saw new patches with hlua_traceback used
> > instead, which look much cleaner to me; should I verify them instead? :)
> >
> > Christopher & Willy, I've forgotten to thank you for your help!
>
> Yes please, try the last 2.2 snapshot. It is really a better way to fix this
> issue because the Lua traceback is never ignored. And it is really safer to
> not allocate memory in the debugger.
>
> So now, we should be able to figure out why the Lua fires the watchdog.
> Because, under the hood, it is the true issue :)
>
> --
> Christopher Faulet