On Friday, April 1, 2016 at 10:42:32 PM UTC+9, Charles-François Natali 
wrote:
>
> EPOLLEXCLUSIVE is mostly useful in a multi-threaded context, to avoid 
> the thundering herd problem when a FD is registered in several poll 
> sets managed by several threads, 
>
> The above benchmark you give is for uWSGI and meinheld which are 
> implemented in C, probably using multiple threads, 
>

EPOLLEXCLUSIVE is useful in multi-process (prefork) servers too.
uWSGI supports both prefork and threaded modes, but I tested in single-thread
mode (pure prefork).
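
To make the prefork case concrete, here is a minimal sketch (not uWSGI's code)
of what each worker does when the listening socket is registered with
EPOLLEXCLUSIVE.  It assumes a Python whose select module exposes
EPOLLEXCLUSIVE (3.6+) and a Linux 4.5+ kernel:

import os
import select
import socket

NUM_WORKERS = 4

def worker(listener):
    ep = select.epoll()
    # EPOLLEXCLUSIVE is only valid at registration (EPOLL_CTL_ADD) time.
    # With it, the kernel wakes only one of the waiting workers per event.
    ep.register(listener.fileno(), select.EPOLLIN | select.EPOLLEXCLUSIVE)
    while True:
        for _fd, _events in ep.poll():
            try:
                conn, _addr = listener.accept()
            except BlockingIOError:
                continue  # now rare: no herd of workers was woken
            conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nhi")
            conn.close()

listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", 8000))
listener.listen(128)
listener.setblocking(False)

for _ in range(NUM_WORKERS):
    if os.fork() == 0:
        worker(listener)
        os._exit(0)

for _ in range(NUM_WORKERS):
    os.wait()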


> I'm not opposed to adding this to the selectors, but arguably 
> selectors are made to be used from Python, where the GIL is going to 
> limit you much before the thundering herd. 
>

In the case of prefork workers, the GIL is not the problem.
In a prefork server written in C, the context switches caused by useless
wakeups (= thundering herd) are the problem.
In a prefork server written in Python, the overhead of the context switch
itself is relatively small.  But the useless code execution
(epoll ~ callback ~ accept ~ EAGAIN ~ epoll) is a significant overhead.
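
In code, that wasted path looks roughly like this (illustrative names, not
gunicorn's actual code): every worker that was woken up runs its accept
callback, and all but one of them immediately get EAGAIN and fall back into
the poll.

import selectors

def accept_callback(listener):
    # One of the woken workers wins the race; the rest execute all of this
    # just to get EAGAIN (BlockingIOError) and return to epoll.
    try:
        conn, _addr = listener.accept()
    except BlockingIOError:
        return None   # wasted: epoll -> callback -> accept -> EAGAIN -> epoll
    return conn

def worker_loop(listener, handle):
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ, accept_callback)
    while True:
        for key, _mask in sel.select():
            conn = key.data(key.fileobj)
            if conn is not None:
                handle(conn)   # only the winning worker does real work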
 

>
> Do you actually have a benchmark showing an improvement? 
>

gunicorn is a popular web server implemented in Python.  It has several
"worker classes".

"gthread" worker class handles connections by asyncio and process request 
in thread pool.
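
For reference, the pattern is roughly this (a sketch of the idea, not
gunicorn's actual implementation): each worker process runs one selector loop
that accepts connections and hands each request to a thread pool.

import selectors
import socket
from concurrent.futures import ThreadPoolExecutor

def handle_request(conn):
    conn.recv(65536)  # read the request (a real server parses it properly)
    conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nhi")
    conn.close()

def serve(listener, threads=2):
    pool = ThreadPoolExecutor(max_workers=threads)
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ)
    while True:
        for key, _mask in sel.select():
            try:
                conn, _addr = key.fileobj.accept()
            except BlockingIOError:
                continue  # another worker process accepted it first
            pool.submit(handle_request, conn)

if __name__ == "__main__":
    listener = socket.socket()
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("", 8000))
    listener.listen(128)
    listener.setblocking(False)
    serve(listener)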

On a c4.xlarge (quad-core) machine:

gunicorn -k gthread -w4 -b :8000 helloapp   # number of processes == number of cores
ab -c1 -n10000 http://127.0.0.1:8000/

The result is about 2.2k req/sec.

gunicorn -k gthread -w16 -b :8000 helloapp  # number of processes == number of cores * 4
ab -c1 -n10000 http://127.0.0.1:8000/

The result is about 1.9k req/sec.

So I see about a 10% loss caused by the thundering herd.
And when I tried EPOLLEXCLUSIVE, the performance of the 16-worker case was the
same as the 4-worker case.
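
For completeness, helloapp above is just a minimal WSGI hello-world module;
the exact code doesn't matter, but it is something along these lines (gunicorn
looks up a callable named "application" by default):

# helloapp.py
def application(environ, start_response):
    body = b"Hello, World!"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]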

I'll try aiohttp when I have time.
