Hi William,

On Tue, Apr 09, 2019 at 01:54:03PM +0000, William Dauchy wrote:
> Hello,
>
> Probably a useless report as I don't have a lot information to provide,
> but we faced an issue where the unix socket was unresponsive, with the
> processes using all cpu (1600% with 16 nbthreads)
>
> I only have the backtrace of the main process but lost the backtrace of
> all threads (nbthread 16). I was also unable to get a response from the
> socket.

Did you issue one of the commands that tries to be alone, i.e. "show sess"
or "show fd"? It's possible that only one thread was blocked initially,
and that once the command started waiting for all threads to stop, at
some point they all woke up to wait and all ate your CPU. Thus the real
issue to figure out is why one thread was blocked.

> (gdb) bt
> #0  0x00005636716d7fbe in fwrr_set_server_status_up (srv=0x5636928e6700) at
> src/lb_fwrr.c:112

This is a spinlock, so apparently it is one of the culprits. The problem
is that I see no other place in this part of the code where this lock is
taken and not released, so it might have happened in another function.

If you ever manage to see this again and obtain a core, I'm interested in
seeing all threads' backtraces. In the meantime I'm carefully rechecking
all locked functions.

Thanks,
Willy