On Fri, Apr 20, 2018 at 09:41:08AM +0300, Slawa Olhovchenkov wrote:
> On Fri, Apr 20, 2018 at 08:22:04AM +0200, Willy Tarreau wrote:
> 
> > On Fri, Apr 20, 2018 at 09:11:47AM +0300, Slawa Olhovchenkov wrote:
> > > > Try 1.8.8, it contains the kqueue fix.
> > > 
> > > Works (kqueue), nice!
> > 
> > Excellent, thanks for your feedback!
> 
> Thanks for the fix!
> 
> > > Average load is the same as for multiprocess, but the per-CPU load
> > > ranges from 0.06 to 0.39. Is this normal or expected?
> > 
> > It can depend on your workload. There is *always* a small overhead
> > incurred by thread synchronization that doesn't exist between processes,
> > but the ability to share certain elements can sometimes be beneficial
> > as well (eg: shared cache in CPU).
> 
> Hmm, maybe I was not clear.
> In process mode all 8 CPUs have a load of 0.18. In thread mode the overall
> average load is still about 0.18, but the per-CPU load now differs:
> 
> 0: 0.13
> 1: 0.15
> 2: 0.07
> 3: 0.40
> 4: 0.23
> 5: 0.33
> 6: 0.16
> 7: 0.15
> 
> Average (0.20) is about the same as 0.18 (raised by more users now)

Oh I think I understand. In multi-process mode you probably had several
listening sockets, one per process. But in multi-thread mode you likely
have a single one. Given that connections are not migrated between threads
in 1.8, if a thread accepts a burst of connections it will have to process
all of them itself.

What you can do is to keep multiple listeners, each bound to a different
thread, exactly like you did with processes:

   bind :80 ... process 1/1
   bind :80 ... process 1/2
   bind :80 ... process 1/3
   bind :80 ... process 1/4
   ...
   bind :80 ... process 1/8
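
Spelled out in full for 8 threads, this could look roughly like the sketch
below. It is only an illustration: the frontend name "fe_main" is a
placeholder that does not come from this thread, the bind options elided
as "..." above are left out, and "nbthread 8" assumes a machine with at
least 8 enabled CPUs.

   global
       # assumption: 8 threads on a machine with at least 8 enabled CPUs
       nbthread 8

   # "fe_main" is a hypothetical name used only for this example
   frontend fe_main
       # one listening socket per thread of process 1, so that each
       # thread accepts its own connections
       bind :80 process 1/1
       bind :80 process 1/2
       bind :80 process 1/3
       bind :80 process 1/4
       bind :80 process 1/5
       bind :80 process 1/6
       bind :80 process 1/7
       bind :80 process 1/8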

I think we should post an article to explain how to optimize configs for
threads, especially when coming from nbproc. Also, one very important
thing I don't mention enough is that you must never ever have more
threads than enabled CPUs. The threads are meant to be fast and almost
never interrupted outside of poll(). If a context switch happens while
a thread holds a lock, it will stall other threads' processing.

> > If you run at high connection rates (tens of thousands of connections per second) on
> > 8+ threads, the cost of locking starts to be quite visible. That's where
> > we know that threads scale less than processes (and what we're improving
> > in 1.9). But as long as you have some idle time, threads are much more
> > convenient to use (single stats, checks, tables etc) and should be 
> > preferred.
> 
> connection rate 1900/s
> session rate 1900/s
> request rate 5200/s

OK, at this rate the locking has no reason to be noticeable.

Willy

