On Sun, Jan 22, 2012 at 9:44 AM, Frank Schoep <fr...@ffnn.nl> wrote:
> On 19 jan 2012, at 21:53, Nick Mathewson wrote:
>> …
>> The usual way is with events that you manually activate. You'll need
>> to use libevent 2 for threading support, and call one of the
>> appropriate functions at startup time to turn on threading support.
>
> I've been thinking about this over the past days and I wondered – how do
> POSIX signal callbacks behave under multithreaded conditions?

Sadly, this whole business is a huge pile of gunk. The interaction between signals and posix threads (without even bringing Libevent into the picture) is really, really stupid, mainly because the original pthreads spec is designed to allow pure-userspace implementations where all threads share one process, as well as implementations where each thread _is_ a separate process.

Basically, under pthreads, any signal sent to the _process_ is received by an *arbitrary* thread that doesn't have the signal blocked. It could be the same thread every time; it could be a randomly chosen thread.

(BTW, some details above are surely wrong: sometimes I feel like there is an evil conspiracy that changes signal semantics around behind my back whenever I'm not looking... but really, they're just hard.)

> Could I add a signal callback to the libevent loop running on one thread and
> have another thread raise that signal, invoking the callback on the other
> thread? Do I still need to use threading support in that case, or will the
> raised signal always properly register in a 'vanilla' libevent loop? Has
> anyone ever tried this?

This wouldn't actually be any faster than the event_active() approach. In most cases[*], Libevent handles signals by installing a signal handler with signal(). The handler uses a socketpair to communicate with the event base, so no matter what thread it gets run in, the event_base finds out.

In comparison, when using event_active() to tell the event_base to do something, you're using the (more optimized)[***] "notification" pathway, which doesn't need signals at all, and sometimes doesn't even need to hit kernelspace. So it's worth benchmarking, but I'd be surprised if you got a big speed improvement there.

[*] kqueue is an exception, since kqueue can handle signals on its own. I'd like to be using signalfd on Linux too, but there are technical challenges there. See the sourceforge ticket [**] for more info.

[**] https://sourceforge.net/tracker/?func=detail&atid=461324&aid=3436038&group_id=50884

[***] The internal "evthread_notify_base()" function is used to tell an event_base that it should wake up from its loop (if it's in one) and handle changes to the list of active or pending events made by another thread. The code uses eventfd where available, a pipe when eventfd is missing (hm, the signal path should do that too), and a socketpair as a final resort. It also takes pains to avoid redundant wakeups: if the base is already going to wake up for some other reason, it doesn't poke the socketpair/pipe/eventfd again.

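To make that concrete, here's a rough sketch of the "queue work, then poke the base with event_active()" pattern. It assumes libevent 2.x built with pthreads support (link with -levent_pthreads); the work_item type, the queue, and the function names are just placeholders for illustration, not libevent API:

/* Sketch: wake up an event_base from a worker thread with event_active().
 * Assumes libevent 2.x with pthread support (link with -levent_pthreads).
 * work_item, queue_work() and drain_queue_cb() are made-up names; only the
 * event2/* calls are libevent API. */

#include <event2/event.h>
#include <event2/thread.h>
#include <pthread.h>
#include <stdlib.h>

struct work_item {
    struct work_item *next;
    /* ... payload ... */
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct work_item *queue_head;    /* protected by queue_lock */
static struct event *wakeup_ev;         /* fd-less, manually activated */

/* Runs in the event_base's thread each time a worker activates wakeup_ev. */
static void drain_queue_cb(evutil_socket_t fd, short what, void *arg)
{
    struct work_item *items;
    (void)fd; (void)what; (void)arg;

    pthread_mutex_lock(&queue_lock);    /* held only long enough to swap */
    items = queue_head;
    queue_head = NULL;
    pthread_mutex_unlock(&queue_lock);

    while (items) {                     /* note: LIFO order in this sketch */
        struct work_item *next = items->next;
        /* ... handle one item in the event loop's thread ... */
        free(items);
        items = next;
    }
}

/* Called from worker threads.  event_active() goes through the internal
 * notification path (eventfd/pipe/socketpair), not through signals. */
void queue_work(struct work_item *item)
{
    pthread_mutex_lock(&queue_lock);
    item->next = queue_head;
    queue_head = item;
    pthread_mutex_unlock(&queue_lock);
    event_active(wakeup_ev, 0, 0);
}

int main(void)
{
    struct event_base *base;

    evthread_use_pthreads();            /* before creating the base */
    base = event_base_new();

    /* fd = -1 and events = 0: this event never fires on its own; it only
     * runs when some thread calls event_active() on it. */
    wakeup_ev = event_new(base, -1, 0, drain_queue_cb, NULL);

    /* ... add your usual socket/timer events, start worker threads that
     * build work_items and call queue_work(), then run the loop ... */
    event_base_dispatch(base);

    event_free(wakeup_ev);
    event_base_free(base);
    return 0;
}

The lock here is only held long enough to swap a pointer, so it should almost never be contested; and per footnote [***] above, the event_active() call doesn't poke the notification fd again if the base is already going to wake up anyway.
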
> My (naive) assumption is that, since signals work at the process level, any
> of its pthreads will potentially become aware of a signal shortly after it is
> raised, because the process-to-pthread(s) mapping would act as an improvised
> mutex at the OS/scheduler level.
>
> Does that assumption make any sense, is it in use already by existing
> applications? Do Windows threads, using _beginthreadex, behave differently
> (like almost everything on Windows) compared to UNIX-like systems? Should I
> file for a software patent (j/k) or throw this idea in the trash bin?
>
> I'm sorry if I seem to be overengineering my application's design at the
> moment, but I really want to try to keep all inner loops lock free to
> maximize throughput. Although pthread mutex locking is fairly fast, skipping
> it altogether would be even faster, I think.

My understanding is that with a good implementation, pthread mutex locking is blazingly fast *in the uncontested case*. So what you ought to be worrying about is not "how often do I lock/unlock?" but rather "is the lock contested?" Maybe the right approach is to make sure that you aren't actually hitting lock contention here, and only optimize further if you are. Of course, as with all other performance discussions, testing is king.

The "queue work for the main event_base thread" pattern (roughly what's sketched above) is pretty common in multithreaded libevent programming. If we can come up with some reasonable benchmarks for that, I'd like to try to find good ways to optimize for it.

cheers,
-- 
Nick