Re: High load average under 1.8 with multiple draining processes

Willy Tarreau Fri, 12 Jan 2018 06:56:30 -0800

Hi Samuel,

On Thu, Jan 11, 2018 at 08:29:15PM -0600, Samuel Reed wrote:
> Is there a regression in the 1.8 series with SO_REUSEPORT and nbthread
> (we didn't see this before with nbproc) or somewhere we should start
> looking?


In fact no: nbthread is simply new, so it's not a regression, but we're
starting to see some side effects. One possibility I can easily imagine
is that in most places we're using spinlocks, because the locks are very
short-lived and very small, so they have to be cheap. One limitation of
spinlocks is that you must not have more threads than cores. Otherwise a
thread can be scheduled out while holding a lock, leaving another thread
to spin for nothing for a whole timeslice.
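
To make the spinlock concern concrete, here is a minimal sketch of a
test-and-set spinlock using C11 atomics. It is only an illustration: the
demo_spin_* names are hypothetical and this is not the lock code used in
haproxy. The point is the busy-wait loop in demo_spin_lock(): if the lock
holder has been scheduled out, a waiter burns CPU there until the holder
runs again and releases the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical demo lock for illustration; not haproxy's implementation. */
struct demo_spinlock {
    atomic_flag held;
};

static inline void demo_spin_init(struct demo_spinlock *l)
{
    atomic_flag_clear(&l->held);
}

/* Returns true if the lock was free and now belongs to the caller. */
static inline bool demo_spin_trylock(struct demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static inline void demo_spin_lock(struct demo_spinlock *l)
{
    /* Busy-wait: cheap when the holder is running on another core,
     * wasteful when the holder has been preempted with the lock held. */
    while (!demo_spin_trylock(l))
        ;
}

static inline void demo_spin_unlock(struct demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

With one thread per core, the holder is normally running and the wait
lasts only a few instructions. With more runnable threads than cores, the
wait can last as long as the holder stays off the CPU.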

The reload makes an interesting case: if you use cpu-map to bind your
threads to CPU cores, then during the soft-stop period the old and new
processes have to coexist on the same cores, and a thread of one process
disturbs a thread of another process by regularly stealing its CPU.

I can't say I see any easy solution to this in the short term; it's
something we have to add to the list of things to improve in the future.
Maybe something as simple as starting with SCHED_FIFO to prevent threads
from being preempted outside of the poll loop, and dropping it upon reload,
could help a lot, but that's just speculation.
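
As a rough sketch of what that speculation could look like on Linux, the
two helpers below switch the calling thread to SCHED_FIFO and back to the
default SCHED_OTHER policy using sched_setscheduler(). The demo_* names
are hypothetical and nothing like this exists in haproxy as far as this
message says. It also assumes the process is allowed to request a
real-time policy (typically CAP_SYS_NICE or a suitable RLIMIT_RTPRIO),
otherwise demo_enter_fifo() simply fails with -1.

```c
#include <sched.h>

/* Hypothetical helper: ask for the lowest SCHED_FIFO priority so that
 * ordinary SCHED_OTHER threads cannot preempt the caller.
 * Returns 0 on success, -1 on error (for example, missing privileges). */
static int demo_enter_fifo(void)
{
    struct sched_param sp = { 0 };
    int prio = sched_get_priority_min(SCHED_FIFO);

    if (prio < 0)
        return -1;
    sp.sched_priority = prio;
    /* On Linux, pid 0 means the calling thread. */
    return sched_setscheduler(0, SCHED_FIFO, &sp);
}

/* Hypothetical helper: go back to the default time-sharing policy, as an
 * old process might do when it starts its soft-stop.
 * SCHED_OTHER requires a static priority of 0. */
static int demo_leave_fifo(void)
{
    struct sched_param sp = { 0 };

    sp.sched_priority = 0;
    return sched_setscheduler(0, SCHED_OTHER, &sp);
}
```

A SCHED_FIFO thread only gives up the CPU when it blocks (for instance in
the poller), yields, or is preempted by a higher-priority real-time
thread, so a thread that spins under this policy can starve everything
else on its core. That is one reason this remains only an idea to test.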

We'll have to continue to think about this I guess. It may be possible
that if your old processes last very long you'd continue to get a better
experience using nbproc than nbthread :-/

Willy
