I was interested in the max attainable throughput of libevent when using 
bufferevents, so I modified the le-proxy sample into a small performance test 
program, pushing 64K buffers as fast as possible over a single full-duplex 
connection.

Initially I was a little disappointed by the results when running the test 
on localhost, compared to a simple program that does the same thing using 
select directly (with large buffers in the read call!): 

Linux (virtual machine) : 380 Mb/s  (cf. 1.45 Gb/s using select directly)
Windows                 :  82 Mb/s  (cf.  100 Mb/s using select directly)
Windows (IOCP)          :  93 Mb/s
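
The select-based comparison program is not included in this message. As a 
rough illustration of that kind of loop, here is a minimal sketch, assuming 
POSIX and a local socket pair rather than the localhost connection used in 
the measurements above. The names set_nonblocking and pump_with_select are 
made up for this example and are not part of libevent. The sketch counts 
read() calls, so the effect of the read buffer size is easy to see.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

#define WRITE_CHUNK 65536  /* the sender always offers up to 64K per write() */

/* Hypothetical helper (not libevent): switch fd to non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags < 0 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Nonzero if errno only says "try again later". */
static int is_transient(int err)
{
    return err == EAGAIN || err == EWOULDBLOCK || err == EINTR;
}

/*
 * Hypothetical helper: push `total` bytes through a socket pair with
 * select(), reading at most `read_size` (> 0) bytes per read() call.
 * Returns the number of bytes received, or -1 on error. *read_calls is
 * set to the number of read() calls that returned data.
 */
static long pump_with_select(size_t total, size_t read_size, long *read_calls)
{
    int sv[2];
    size_t sent = 0, received = 0;
    long result = -1;
    char *out = calloc(1, WRITE_CHUNK);
    char *in = malloc(read_size);

    *read_calls = 0;
    if (!out || !in || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        free(out);
        free(in);
        return -1;
    }
    if (set_nonblocking(sv[0]) < 0 || set_nonblocking(sv[1]) < 0)
        goto done;

    while (received < total) {
        fd_set rfds, wfds;
        int maxfd = sv[0] > sv[1] ? sv[0] : sv[1];

        FD_ZERO(&rfds);
        FD_ZERO(&wfds);
        FD_SET(sv[1], &rfds);
        if (sent < total)
            FD_SET(sv[0], &wfds);   /* only wait for writability while sending */

        if (select(maxfd + 1, &rfds, &wfds, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            goto done;
        }
        if (FD_ISSET(sv[0], &wfds)) {
            size_t left = total - sent;
            ssize_t n = write(sv[0], out, left < WRITE_CHUNK ? left : WRITE_CHUNK);
            if (n > 0)
                sent += (size_t)n;
            else if (n < 0 && !is_transient(errno))
                goto done;
        }
        if (FD_ISSET(sv[1], &rfds)) {
            ssize_t n = read(sv[1], in, read_size);
            if (n > 0) {
                received += (size_t)n;
                ++*read_calls;
            } else if (n == 0 || !is_transient(errno)) {
                goto done;          /* unexpected EOF or hard error */
            }
        }
    }
    result = (long)received;
done:
    close(sv[0]);
    close(sv[1]);
    free(out);
    free(in);
    return result;
}
```

Calling pump_with_select(1 << 20, 4096, &calls) and then again with a 
read_size of 65536 moves the same 1 MB both times, but the second run can 
get by with far fewer read() calls, since each call may return up to 64K. 
This is only meant to show the shape of the loop; it says nothing about 
the absolute numbers in the table.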

So, I decided to do a little bit of profiling on my Windows box, and noticed a 
lot of time spent in grabbing the locks in read_complete and write_complete in 
bufferevent_async.c. After some further digging I noticed a couple of fixed 
buffer sizes limited to 16K and 4K in 
(some of which were marked FIXME, so I must be onto something ;)

When I changed all sizes to 64K, this gave quite a bit of improvement:

Linux (virtual machine) : 1220 Mb/s
Windows (select)        :   96 Mb/s
Windows (IOCP)          :  125 Mb/s

That's pretty close to the numbers when using select directly, and exceeds 
them when using IOCP on Windows. In the latter case, the time spent 
acquiring the locks was also considerably lower. 

Some further experiments, varying both the buffer sizes used by my test 
program and the internal libevent buffer sizes, seem to indicate that 
64K gives the best performance throughout.

Now I'm not saying that 64K in internal buffers would be a good price to pay for 
every type of connection, but I can imagine that when throughput is important 
it would be nice if one could change these fixed buffer sizes through an API.

Any thoughts?

Marcel Roelofs

Libevent-users mailing list
