Hi Samuel,

On Thu, Jan 11, 2018 at 08:29:15PM -0600, Samuel Reed wrote:
> Is there a regression in the 1.8 series with SO_REUSEPORT and nbthread
> (we didn't see this before with nbproc) or somewhere we should start
> looking?

In fact no, nbthread is simply new, so it's not a regression, but we're
starting to see some side effects. One possibility I can easily imagine is
that in most places we're using spinlocks, because the locks are very
short-lived and very small, so they have to be cheap. One limit of spinlocks
is that you must not have more threads than cores, so that a thread is never
scheduled out with a lock held, leaving another one to spin for nothing for
a whole timeslice.

The reload makes an interesting case: if you use cpu-map to bind your
threads to CPU cores, then during the soft-stop period the old and new
processes have to coexist on the same cores, and a thread of one process
disturbs the thread of another process by regularly stealing its CPU.

I can't say I see any easy solution to this in the short term; it's
something we have to add to the list of things to improve in the future.
Maybe something as simple as starting with SCHED_FIFO to prevent threads
from being preempted outside of the poll loop, and dropping it upon reload,
could help a lot, but that's just speculation. We'll have to continue to
think about this I guess.

It's possible that if your old processes last very long, you'd continue to
get a better experience using nbproc than nbthread :-/

Willy
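
For illustration only, here is a minimal sketch of what the SCHED_FIFO idea
above could look like at the system-call level on Linux. It is not haproxy
code, and it is as speculative as the idea itself. The helper names
enter_fifo() and leave_fifo() are hypothetical, and the sketch assumes the
process is allowed to use real-time policies (CAP_SYS_NICE or a suitable
RLIMIT_RTPRIO); otherwise the first call simply fails with EPERM.

```c
#include <sched.h>
#include <string.h>

/* Hypothetical helper: move the calling thread to SCHED_FIFO at the lowest
 * real-time priority, so that ordinary SCHED_OTHER threads can no longer
 * preempt it. On Linux, pid 0 designates the calling thread only, so each
 * worker thread would have to call this itself.
 * Returns 0 on success, -1 with errno set (typically EPERM) on failure.
 */
int enter_fifo(void)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = sched_get_priority_min(SCHED_FIFO);
    return sched_setscheduler(0, SCHED_FIFO, &sp);
}

/* Hypothetical helper: return to the default time-sharing policy, as an old
 * process might do once it enters soft-stop, so that it stops competing at
 * real-time priority with the new process bound to the same cores.
 * SCHED_OTHER requires a priority of 0.
 * Returns 0 on success, -1 with errno set on failure.
 */
int leave_fifo(void)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = 0;
    return sched_setscheduler(0, SCHED_OTHER, &sp);
}
```

A thread would call enter_fifo() once at startup and leave_fifo() when its
process is asked to stop. Note that a SCHED_FIFO thread that never sleeps
can starve everything else on its core, so this only makes sense for
threads that do block in the poller.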