Hi again,

On Thu, Aug 16, 2018 at 05:50:27PM +0200, Olivier Houchard wrote:
> Hi Pieter,
>
> On Thu, Aug 16, 2018 at 12:24:04AM +0200, PiBa-NL wrote:
> > Hi List,
> >
> > Anyone got a idea how to debug this further?
> > Currently its running at 100% again, any pointers to debug the process as
> > its running would be appreciated.
> >
> > Or should i compile again from current master and 'hope' it doesn't return?
> >
> > b.t.w. truss output is as follows:
> > kevent(3,0x0,0,{ },200,{ 0.000000000 }) = 0 (0x0)
> > kevent(3,0x0,0,{ },200,{ 0.000000000 }) = 0 (0x0)
> > kevent(3,0x0,0,{ },200,{ 0.000000000 }) = 0 (0x0)
> > kevent(3,0x0,0,{ },200,{ 0.000000000 }) = 0 (0x0)
> > kevent(3,0x0,0,{ },200,{ 0.000000000 }) = 0 (0x0)
> > kevent(3,0x0,0,{ },200,{ 0.000000000 }) = 0 (0x0)
> > kevent(3,0x0,0,{ },200,{ 0.000000000 }) = 0 (0x0)
> >
> > Regards,
> > PiBa-NL (Pieter)
>
> I'm interested in figuring that one out.
> From the look at it, it seems the scheduler thinks there's a task to be run,
> and so won't let the poller sleep in kevent().
> Can you
> - update to the latest master, even though I don't think any relevant fix
>   was applied since Jul 30.
> - compile it with -O0, so that we can get meaningful informations from gdb.
> - When/if that happens again, getting a core, and send it to me with the
>   haproxy binary, assuming there's no confidential information in that core,
>   of course.
>
> Thanks !
>

So after discussing on IRC, I'm pretty sure I figured it out. The two attached
patches should fix it.

Thanks a lot !

Olivier

From 90fc92f777772c6b47d88769bb73680702d7b8e6 Mon Sep 17 00:00:00 2001
From: Olivier Houchard <ohouch...@haproxy.com>
Date: Thu, 16 Aug 2018 19:03:02 +0200
Subject: [PATCH 1/2] BUG/MEDIUM: tasks: Don't insert in the global rqueue if nbthread == 1

Make sure we don't insert a task in the global run queue if nbthread == 1,
as, as an optimisation, we avoid reading from it if nbthread == 1.
---
 src/task.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/task.c b/src/task.c
index de097baf7..e357bc169 100644
--- a/src/task.c
+++ b/src/task.c
@@ -395,7 +395,8 @@ void process_runnable_tasks()
 		state = HA_ATOMIC_AND(&t->state, ~TASK_RUNNING);
 		if (state)
 #ifdef USE_THREAD
-			__task_wakeup(t, (t->thread_mask == tid_bit) ?
+			__task_wakeup(t, (t->thread_mask == tid_bit ||
+			    global.nbthread == 1) ?
 			    &rqueue_local[tid] : &rqueue);
 #else
 			__task_wakeup(t, &rqueue_local[tid]);
-- 
2.14.3

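As a rough illustration of why the first patch matters, here is a toy model in C. It is not HAProxy code: the struct, the `toy_*` functions and the `patched` switch are all hypothetical, and it assumes that the decision to let the poller sleep counts tasks in both the local and the global run queue. Under that assumption, a task parked in the global queue with nbthread == 1 is never picked up, so the poller keeps returning immediately, which would match the kevent() loop seen in the truss output.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT HAProxy code: every name below is hypothetical. */
struct toy_sched {
    int nbthread;       /* number of configured threads */
    int local_tasks;    /* tasks queued in this thread's local run queue */
    int global_tasks;   /* tasks queued in the shared (global) run queue */
};

/* Requeue a task that was woken up again while it was running.
 * 'pinned' models t->thread_mask == tid_bit; 'patched' selects the
 * behaviour with or without the extra nbthread == 1 check. */
void toy_requeue(struct toy_sched *s, bool pinned, bool patched)
{
    if (pinned || (patched && s->nbthread == 1))
        s->local_tasks++;
    else
        s->global_tasks++;
}

/* One scheduler pass: the global queue is only read when nbthread > 1
 * (the optimisation mentioned in the commit message). */
void toy_run_once(struct toy_sched *s)
{
    if (s->nbthread > 1) {
        s->local_tasks += s->global_tasks;
        s->global_tasks = 0;
    }
    s->local_tasks = 0;   /* run everything that is queued locally */
}

/* Assumption: the poller may only sleep when no task waits anywhere. */
bool toy_can_sleep(const struct toy_sched *s)
{
    assert(s->local_tasks >= 0 && s->global_tasks >= 0);
    return s->local_tasks + s->global_tasks == 0;
}

/* Returns true if the poller can sleep after 'passes' scheduler passes. */
bool toy_simulate(bool patched, int passes)
{
    struct toy_sched s = { .nbthread = 1, .local_tasks = 0, .global_tasks = 0 };

    toy_requeue(&s, false, patched);   /* unpinned task woken while running */
    for (int i = 0; i < passes; i++)
        toy_run_once(&s);
    return toy_can_sleep(&s);
}
```

In this model, `toy_simulate(false, n)` never reaches a sleepable state no matter how many passes are run, while `toy_simulate(true, 1)` does after a single pass.
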
From 7640aa3de3c9ad00fe82cda4a50351e46fc0bf48 Mon Sep 17 00:00:00 2001
From: Olivier Houchard <ohouch...@haproxy.com>
Date: Thu, 16 Aug 2018 19:03:50 +0200
Subject: [PATCH 2/2] BUG/MEDIUM: sessions: Don't use t->state.

In session_expire_embryonic(), don't use t->state, use the "state" argument
instead, as t->state has been cleaned before we're being called.
---
 src/session.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/session.c b/src/session.c
index 1d66b739f..d7d8544c7 100644
--- a/src/session.c
+++ b/src/session.c
@@ -388,7 +388,7 @@ static struct task *session_expire_embryonic(struct task *t, void *context, unsi
 {
 	struct session *sess = context;
 
-	if (!(t->state & TASK_WOKEN_TIMER))
+	if (!(state & TASK_WOKEN_TIMER))
 		return t;
 
 	session_kill_embryonic(sess);
-- 
2.14.3
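
The second patch can be pictured with a similarly small sketch. Again this is a toy model and not HAProxy code: the flag value, the struct and the `toy_*` functions are hypothetical. It only assumes what the commit message states, namely that the caller clears the task's state before invoking the handler and passes the previous value as an argument.

```c
#include <stdbool.h>

/* Toy model, NOT HAProxy code: names and flag values are hypothetical. */
#define TOY_WOKEN_TIMER 0x01u

struct toy_task {
    unsigned int state;   /* pending wake-up reasons */
};

typedef bool (*toy_handler)(struct toy_task *t, unsigned int state);

/* Handler reading the task field: it sees the already-cleared value. */
bool toy_expired_from_task(struct toy_task *t, unsigned int state)
{
    (void)state;
    return (t->state & TOY_WOKEN_TIMER) != 0;
}

/* Handler reading the argument: it sees the snapshot taken by the caller. */
bool toy_expired_from_arg(struct toy_task *t, unsigned int state)
{
    (void)t;
    return (state & TOY_WOKEN_TIMER) != 0;
}

/* Scheduler side: snapshot the wake-up reasons, clear them on the task,
 * then call the handler with the snapshot. */
bool toy_dispatch(struct toy_task *t, toy_handler fn)
{
    unsigned int snapshot = t->state;

    t->state = 0;
    return fn(t, snapshot);
}

/* Wake a task for a timer and report whether the handler noticed it. */
bool toy_timer_seen(toy_handler fn)
{
    struct toy_task t = { .state = TOY_WOKEN_TIMER };

    return toy_dispatch(&t, fn);
}
```

Here `toy_timer_seen(toy_expired_from_task)` misses the timer wake-up because the field was cleared before the call, while `toy_timer_seen(toy_expired_from_arg)` sees it.
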