I was interested in the maximum attainable throughput of libevent when using bufferevents, so I modified the le-proxy sample into a small performance test program that pushes 64K buffers as fast as possible over a single full-duplex connection.
I was a little disappointed by the initial results when running the test on localhost, compared to doing the same with a simple program using select (and large buffers in the read call!) directly:

  Linux (virtual machine): 380 Mb/s (cf. 1.45 Gb/s using select directly)
  Windows: 82 Mb/s (cf. 100 Mb/s using select directly)
  Windows (IOCP): 93 Mb/s

So I did a little profiling on my Windows box and noticed a lot of time spent grabbing the locks in read_complete and write_complete in bufferevent_async.c. After some further digging I found a couple of fixed buffer sizes, limited to 16K and 4K, in bufferevent_async.c, bufferevent_ratelim.c, and buffer.c (some of which were marked FIXME, so I must be onto something ;)

When I changed all of these sizes to 64K, it gave quite a bit of improvement:

  Linux (virtual machine): 1220 Mb/s
  Windows (select): 96 Mb/s
  Windows (IOCP): 125 Mb/s

That's pretty close to the numbers when using select directly, and exceeds them when using IOCP on Windows. In the latter case, the time spent acquiring the locks was also considerably less.

Some further experiments, varying both the buffer sizes used by my test program and the internal libevent buffer sizes, indicate that 64K gives the best performance throughout.

Now I'm not saying that 64K internal buffers would be a good price to pay for every type of connection, but I can imagine that when throughput is important, it would be nice if one could change these fixed buffer sizes through an API.

Any thoughts?

Cheers,
Marcel Roelofs

_______________________________________________
Libevent-users mailing list
Libevent-users@monkey.org
http://lists.monkey.org:8080/listinfo/libevent-users