On Wed, Jan 18, 2012 at 12:56 PM, Myk Taylor <myk...@yahoo.com> wrote:
> Hi all,
>
> I've done a bit more debugging on this, and after reproducing a few
> segfaults (different from the original assert), it seems like the problem is
> related to the issues Zhuang Yuyao was having a year and a half ago:
>
> http://archives.seul.org/libevent/users/Jun-2010/msg00002.html
>
> It is the same situation for me: I am using openssl bufferevents and
> BEV_OPT_DEFER_CALLBACKS is set. I am also beset with warnings like:
>
> Epoll MOD(4) on fd 471 failed. Old events were 6; read change was 2 (del);
> write change was 0 (none): Bad file descriptor

This one usually means that it tried to event_del() an event for an fd
that had already been closed. Assuming that you're using the regular epoll
backend (and not the epoll_changelist one), that's probably a bug
somewhere.

> However those warnings are not always coincidental with a segfault. Here is
> a representative backtrace:
>
> #0 0x0000000000000000 in ?? ()
> #1 0x00007ffff7ba1449 in evbuffer_free (buffer=0x7afee0) at buffer.c:568
> #2 0x00007ffff7ba5dc2 in _bufferevent_decref_and_unlock (bufev=0x7b38a0) at
> bufferevent.c:629
> #3 0x00007ffff7b9cb4b in event_process_deferred_callbacks
> (breakptr=0x71b530, queue=0x71b558) at event.c:1364
> #4 event_process_active (base=<optimized out>) at event.c:1403
> #5 event_base_loop (base=0x71b450, flags=<optimized out>) at event.c:1589
> #6 0x000000000040380e in main (argc=8, argv=0x7fffffffd6b8) at ...
>
> frame 0 is due to the lock callback being null in EVLOCK_LOCK. I note that
> all the segfaults I have cores for involve
> event_process_deferred_callbacks().

Is your code multithreaded? I'm guessing so, from the EVBUFFER_LOCK call.

It's very weird that the "lockvar" (evbuffer->lock) would be set, but
_evthread_mode_fns.lockfn would be NULL. Have you tried looking at the
contents of *buffer in one of these cores? Does it look like it's been
trashed?

> Could it be that the openssl bufferevent isn't canceling pending callbacks
> when it is destroyed? Alternately, are there assumptions about when a
> bufferevent can be destroyed that I am violating?

It's possible! My first guess here would be a reference-counting mistake
somewhere in the code. If this is not so hard to reproduce, a couple of
things to try:

 * Run it under valgrind. If you're using valgrind and openssl at the
   same time, you'll need to pass --undef-value-errors=no to valgrind,
   or rebuild openssl with -DPURIFY.

 * Add a memory poisoning step before freeing an evbuffer or bufferevent.
   (IOW, memset the thing to 0xb0 or something so that you can more easily
   find any attempts to access a freed object)

hth,
-- 
Nick
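
For reference, the memory-poisoning suggestion above can be sketched roughly
as follows. This is only a minimal sketch, not libevent code: struct
demo_object, poison_memory(), looks_poisoned() and demo_object_free() are
hypothetical names made up for illustration. In practice the poisoning call
would presumably go just before the final free in whatever function releases
the evbuffer or bufferevent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Poison value suggested in the message above; any distinctive byte works. */
#define POISON_BYTE 0xb0

/* Hypothetical stand-in for an object such as an evbuffer.
 * This is NOT libevent's real structure layout. */
struct demo_object {
    void *lock;
    size_t total_len;
};

/* Overwrite len bytes at p with POISON_BYTE. Writing through a volatile
 * pointer keeps the compiler from discarding the stores as dead writes
 * when they are immediately followed by free(). */
static void poison_memory(void *p, size_t len)
{
    volatile unsigned char *bytes = p;
    size_t i;

    for (i = 0; i < len; ++i)
        bytes[i] = POISON_BYTE;
}

/* Return 1 if every one of the len bytes at p equals POISON_BYTE, else 0.
 * Handy as a debugging check on an object you still legitimately own. */
static int looks_poisoned(const void *p, size_t len)
{
    const unsigned char *bytes = p;
    size_t i;

    for (i = 0; i < len; ++i) {
        if (bytes[i] != POISON_BYTE)
            return 0;
    }
    return 1;
}

/* Poison the object, then release it. Any later use of a stale pointer
 * will read 0xb0 bytes instead of plausible-looking leftover data. */
static void demo_object_free(struct demo_object *obj)
{
    if (obj == NULL)
        return;
    poison_memory(obj, sizeof(*obj));
    free(obj);
}
```

With this in place, a pointer field read from a freed object shows up as
0xb0b0b0b0b0b0b0b0 on a 64-bit build, which is easy to spot when inspecting
*buffer in a core.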