http://www.ioremap.net/node/200#comment-571

Morning exercise: port multithreaded project to pool of libevent-backed threads

Everyone knows libevent, yeah?
If you do not, then you are unlikely to ever write client-server applications using the event-driven model, since libevent is effectively the best-known and most feature-rich library for this purpose.

So, yesterday I decided to try to port the elliptics network from a thread-based model to an event-driven one. Sounds simple...
But it took me two days to complete, and I'm still not sure whether transaction forwarding (when a server receives data which belongs to some other node in the network, it forwards it there) works without bugs, or what the performance is.

The main problem with libevent is that it was designed for a single-threaded model. But reality tells us that a single IO thread does not scale well for storage systems (and polling does not work for regular files, not even talking about weird cases when we want to store data in a database, where there is no file descriptor to poll at all), so no matter what, we have to have a number of IO threads.

Here comes the first problem. Although libevent claims experimental support for thread-safe dispatching and event queueing, it does not work when the event base (provided by event_init()) is created in one thread and then another thread starts adding events to it or dispatching from it. The version I tested on Ubuntu Hardy (1.3.something, I think) just crashed. So I moved the base allocation into its own thread and started to use the latest 1.4.9 version. It works fine, except that the dispatching code returns an error when there are no events to wait for.
This is a generic library, yet it does not allow sleeping until some event becomes ready when none are registered, so effectively one either has to add a dummy descriptor (like a pipe) or sleep manually when event_dispatch() and friends return 1 (no events).
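For reference, a minimal sketch of the dummy-descriptor workaround, assuming the libevent 1.4 API (the run_dispatch() wrapper and its error handling are mine):

    #include <event.h>
    #include <unistd.h>

    static void dummy_cb(int fd, short what, void *arg)
    {
            /* Never expected to fire: nobody writes to the pipe.
             * It exists only so the event loop is never empty. */
            (void)fd; (void)what; (void)arg;
    }

    int run_dispatch(void)
    {
            struct event_base *base;
            struct event dummy;
            int fds[2];

            base = event_init();
            if (base == NULL || pipe(fds) == -1)
                    return -1;

            /* EV_PERSIST keeps the event registered across wakeups, so
             * event_dispatch() always has something to wait for and
             * never returns 1 ("no events"). */
            event_set(&dummy, fds[0], EV_READ | EV_PERSIST, dummy_cb, NULL);
            event_add(&dummy, NULL);

            return event_dispatch();
    }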
Another problem is with event_loopexit() and the other _loopexit() functions - they allocate an event which I never managed to find freed, though I may be wrong. Initially I thought these functions should be used instead of event_dispatch() to wait for a ready file descriptor, but with a timeout. It looks like I was wrong, but I did not investigate deeply enough.

But all those libevent problems are not real problems - it is just a matter of usage experience. It effectively took me two days to resolve all the issues I ran into.

The more serious one is the event-driven design itself. A TCP socket in Linux already weighs more than 1500 bytes, and a file descriptor is close to 200 bytes. A task structure is about 3800 bytes - noticeably more, but the difference disappears compared to the IO overhead. And the thread-driven model is much simpler to program.
For example, in my network protocol there is a header which describes the attached data. So we have to read two blocks from the network; similar issues appear on the sending side as well, and to resolve them we have to build a rather complex state machine, as sketched below.
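The header layout and struct names below are invented for illustration; only the two-phase header-then-data read follows the protocol described above:

    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    enum recv_state { RECV_HEADER, RECV_DATA };

    struct pkt_header {             /* hypothetical wire header */
            uint64_t data_size;     /* size of the attached data block */
    };

    struct recv_ctx {
            enum recv_state state;
            struct pkt_header hdr;
            void *data;
            size_t offset;          /* bytes received in the current phase */
    };

    /* Drive the state machine each time the socket is readable.
     * Returns 1 when header plus data are fully assembled,
     * 0 when more data is needed (EAGAIN), -1 on error. */
    static int recv_packet(int fd, struct recv_ctx *ctx)
    {
            while (1) {
                    void *buf;
                    size_t total;
                    ssize_t n;

                    if (ctx->state == RECV_HEADER) {
                            buf = (char *)&ctx->hdr + ctx->offset;
                            total = sizeof(ctx->hdr);
                    } else {
                            buf = (char *)ctx->data + ctx->offset;
                            total = ctx->hdr.data_size;
                    }

                    n = recv(fd, buf, total - ctx->offset, 0);
                    if (n < 0)
                            return (errno == EAGAIN) ? 0 : -1;
                    if (n == 0)
                            return -1;      /* peer closed the connection */

                    ctx->offset += n;
                    if (ctx->offset < total)
                            continue;

                    if (ctx->state == RECV_DATA)
                            return 1;       /* complete packet assembled */

                    /* Header complete: switch to reading the attached data. */
                    if (ctx->hdr.data_size == 0)
                            return 1;
                    ctx->data = malloc(ctx->hdr.data_size);
                    if (ctx->data == NULL)
                            return -1;
                    ctx->state = RECV_DATA;
                    ctx->offset = 0;
            }
    }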

A very serious problem arises when we want to manage the same file descriptor from multiple threads - for example, two threads wait on a socket, wake up, and each reads a request to be processed in parallel. This does not really work with libevent, since it contains no internal locking, so we would have to introduce a synchronization primitive at a higher layer. POSIX thread locks under contention end up in the kernel via the sys_futex() syscall, which in this case would happen almost instantly.

Another solution is to have a single thread accept new clients and dynamically spread the new file descriptors among the threads in the IO pool. I made a weird decision to push receiving work to one thread and sending work to another. This simplifies data forwarding, but forces threads to allocate a special control block for each packet to be sent. The blocks are placed into a FIFO queue, and the sending thread operates on the first element without locks (locking only to dequeue it once it has been sent). I selected this model because, for data forwarding, we have to put a given data block into a different socket, which is handled by a different thread, so to be completely non-blocking we have to invent a queue for each thread, drained when the appropriate socket is ready to be written to. A sketch of this queue follows below.
Right now I do not remember why I did not make each thread handle both receiving and sending work (including checking the forwarding queue); maybe I will experiment with that model next time, since I really do not like allocating and freeing a control block for each packet to be sent.
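A hedged sketch of that control-block queue, with invented names (initialize with TAILQ_INIT() and pthread_mutex_init()); it mirrors the scheme described above, where on an empty list a concurrently inserted block is simply picked up on the next wakeup:

    #include <errno.h>
    #include <pthread.h>
    #include <stdlib.h>
    #include <sys/queue.h>
    #include <sys/socket.h>

    struct send_block {                     /* hypothetical control block */
            TAILQ_ENTRY(send_block) entry;
            void *data;
            size_t size, offset;
    };

    struct send_queue {
            TAILQ_HEAD(, send_block) head;
            pthread_mutex_t lock;
    };

    /* Any thread may queue a packet for the socket's owning thread. */
    static int send_queue_push(struct send_queue *q, void *data, size_t size)
    {
            struct send_block *b = malloc(sizeof(*b));

            if (b == NULL)
                    return -1;
            b->data = data;
            b->size = size;
            b->offset = 0;

            pthread_mutex_lock(&q->lock);
            TAILQ_INSERT_TAIL(&q->head, b, entry);
            pthread_mutex_unlock(&q->lock);
            return 0;
    }

    /* Only the owning thread calls this when the socket is writable.
     * It works on the head without the lock, since only this thread
     * ever removes entries; the lock is taken just to dequeue. */
    static int send_queue_drain(struct send_queue *q, int fd)
    {
            struct send_block *b;
            ssize_t n;

            while ((b = TAILQ_FIRST(&q->head)) != NULL) {
                    n = send(fd, (char *)b->data + b->offset,
                             b->size - b->offset, 0);
                    if (n < 0)
                            return (errno == EAGAIN) ? 0 : -1;

                    b->offset += n;
                    if (b->offset < b->size)
                            return 0;       /* socket full, wait for write readiness */

                    pthread_mutex_lock(&q->lock);
                    TAILQ_REMOVE(&q->head, b, entry);
                    pthread_mutex_unlock(&q->lock);
                    free(b->data);
                    free(b);
            }
            return 0;
    }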

I will check how this works with multiple servers and multiple clients, but so far I like the way it behaves in the single-server, single-client case over loopback on my laptop with 256 MB of RAM :)
Changes are available in the git tree.

Checking the local time, I see I no longer have time for the electronics experiments. I wanted to attach a K8055 IO board to sample Dallas' one-wire bus (w1 in my notation) between an iButton and a ds2490, for example, and check whether it can sniff the data channel. I'm not sure it will manage though: first, I do not remember how fast its digital inputs can be sampled from software; second, I'm not sure the ds2490 chip will tolerate this passive load. It has some capacitance, so the w1 bus master may decide to start the negotiation protocol with it, since it will think it is a proper client.

If this succeeds, I will move further and order some PIC-based board to work with and eventually make some kind of w1 sniffer myself.

Why not libev?

I checked it too, but it does not exist in the debian/ubuntu repos, so it is not convenient for an out-of-the-box solution.

Yes, libev is in the debian repos (and so, I guess, in ubuntu).

Check: http://packages.qa.debian.org/libe/libev.html
And searching in packages.ubuntu.com I found: http://packages.ubuntu.com/intrepid/libev3

It seems to be in intrepid but not in hardy. But you know you can just add the deb-src sources and 'apt-get source libev3; apt-get build-dep libev3; cd libev3-*; dpkg-buildpackage -rfakeroot' (I don't remember exactly, but something like this should work to 'backport' the package. Perhaps there is some build dep you need to backport too, or something like that).

Also, Debian's (and hence Ubuntu's) version is kind of old; I already reported the bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=518817. If you hit some bug that is already fixed upstream, you may want to apply some pressure too ;)

I found it only in Lenny and Intrepid. Curious whether it has threading 'support' similar to libevent's.

Yes, see: http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#THREADS_AND_COROUTINES

And it seems it works. Also, in my experience, problems I reported were fixed very quickly, so if it does not work as documented, just report it and I *think* it will be solved.

Does it allow queueing an event from a different thread? I.e. one thread processes the event loop and receives a message to be forwarded to a different file descriptor (event), but that event is managed by a different thread. According to the description, libev does not have any internal locks, so it is forbidden to add multiple events into the same queue in parallel.

Another question: is it allowed to add the same event multiple times, namely once from the processing loop itself and another time when we want some new message to be sent via the given socket? Again, judging by the description, libev does not guard event insertion, so adding the same event in parallel is not allowed, and likely not sequentially either.

Effectively, I need my own (rather trivial) event processing code.

> Does it allow queueing an event from a different thread?

No, you must mutex the access from different threads (if I'm not wrong :)

> Another question: is it allowed to add the same event multiple times, namely once from the processing loop itself and another time when we want some new message to be sent via the given socket?

I don't really understand the question (perhaps I don't really know the code?) :S

An event is a struct: you allocate it and associate a callback with it. So... what do you mean by the same "event"? The event of being able to read from a file descriptor (for example)? Or do you mean the struct?

> Effectively, I need my own (rather trivial) event processing code.

Is it really trivial? (Again, I don't really know what subset of features you need :)

By adding the same event into the queue I meant queueing the same structure twice, i.e. when we want to send data both from the processing thread and from another one (which forwards data into the given socket): we re-add the event from the main thread when its callback is invoked, then queue some work from a different thread and add the same event a second time from that thread (each insertion guarded by some lock, of course). For example, if an event is already added and we queue it again, libevent will exit - not even return an error, but blindly exit the whole process.
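If one had to stay with libevent, tracking the armed state under the same lock that guards the queue would at least prevent the double add; a hedged sketch with invented names, assuming the libevent 1.4 API (this only addresses the double-add, not libevent's general cross-thread problems discussed earlier):

    #include <event.h>
    #include <pthread.h>

    struct guarded_event {
            struct event ev;
            pthread_mutex_t lock;
            int armed;              /* non-zero while the event is added */
    };

    /* Safe to call any number of times from any thread: the event is
     * only handed to event_add() when it is not already pending. */
    static void guarded_event_arm(struct guarded_event *g)
    {
            pthread_mutex_lock(&g->lock);
            if (!g->armed) {
                    g->armed = 1;
                    event_add(&g->ev, NULL);
            }
            pthread_mutex_unlock(&g->lock);
    }

    /* Call from the event callback before deciding to re-arm. */
    static void guarded_event_disarm(struct guarded_event *g)
    {
            pthread_mutex_lock(&g->lock);
            g->armed = 0;
            pthread_mutex_unlock(&g->lock);
    }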

My own event processing just needs to dispatch ready sockets and send/receive data to/from them. No timeouts or any of the other fancy stuff libevent and libev can do. Timeouts will be managed by the TCP layer (if a TCP socket is used) or at the transaction layer (not implemented right now though).
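As an illustration of how little such code needs, here is a rough epoll-based dispatch skeleton (illustrative only, not the actual elliptics code; struct conn and the handler are placeholders):

    #include <errno.h>
    #include <stdint.h>
    #include <sys/epoll.h>

    struct conn;    /* opaque per-socket state, defined elsewhere */

    /* Register a socket; the conn pointer comes back from epoll_wait(). */
    static int conn_register(int epollfd, int fd, struct conn *c)
    {
            struct epoll_event ev;

            ev.events = EPOLLIN;    /* add EPOLLOUT when output is queued */
            ev.data.ptr = c;
            return epoll_ctl(epollfd, EPOLL_CTL_ADD, fd, &ev);
    }

    /* The whole "event library": dispatch ready sockets to a handler
     * that does the actual send/receive. No timers, no signals. */
    static int dispatch_loop(int epollfd,
                             void (*process)(struct conn *c, uint32_t events))
    {
            struct epoll_event evs[64];
            int i, n;

            for (;;) {
                    n = epoll_wait(epollfd, evs, 64, -1);
                    if (n < 0) {
                            if (errno == EINTR)
                                    continue;
                            return -1;
                    }
                    for (i = 0; i < n; i++)
                            process(evs[i].data.ptr, evs[i].events);
            }
    }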

I have it on my Gutsy

What's the current state of the art in web servers? A large amount of research has been done on how to build the fastest possible web server. Lighttpd and Cherokee are two very fast ones, and their code bases are also quite small. Lighttpd was behind some of the performance tweaks done for adaptive readahead in Linux. But in general, AFAIK, none of the current generation of web servers use large numbers of threads; they all use some form of event-driven model.

I saw some emails implying that these servers aren't using libevent any more and have recoded event handling internally. I didn't read any further.

All those servers were created way before Linux in particular, and *nix in general, got proper threading support.
But I agree that the event-driven model is a very high-performance one (especially when scheduling features come into play).

> Lighttpd and Cherokee are two very fast ones

or a Russian production: nginx

I've been giving libevent a chance in every C network project I've done, and I always end up writing the code manually instead, or making my own internal library. libevent is adequate for a quick throw-away program (the type that should be done with Twisted Python instead), but for anything serious it always ends up better to build something suited to the project at hand. That's a sign of a bad library.

