On Thu, Feb 12, 2015 at 09:16:20AM -0800, Marco Graziano wrote: > Hi Azat,
Hi Marco, > > The backtraces I included are from the only threads waiting while holding a > lock. The other threads are providing convincing signs of good health in > the trace log. Could you please post backstraces for all available threads? (I hope you don't have 100 of them). > > Definitely the write callback was called tens of thousand of times before > incurring in the deadlock, and I suspect luck or chance has little to do > with the deadlock. It may have more to do with the fact that the read > callback receiving a message send it down a bufferevent with the write > callback handled by another thread? > > The fact is that after adding BEV_OPT_UNLOCK_CALLBACKS the deadlock has not > appeared again. Before it took about one hour of stress test to appear. If > removing the callback lock solved the problem, please help me understand > what was happening in the first place and if for instance the added lock > inside the callback was unnecessary or even the cause of the deadlock > itself. > > Also, since enabling OpenSSL is coming next, I was wondering if you have a > word of wisdom or two, on how to better tackle the task at hands or which > version of libevent is best suited for the job. I am using the head of the > tree, or to do some debugging if that it was it takes, I would be okay with > it. I am afraid that locking the callbacks may be needed with OpenSSL > enabled. I prefer to use the latest versions, to be up-to-date. bufferevent_openssl doesn't require bufferevent locking - look at bufferevent_openssl_new_impl() in bufferevent_openssl.c > > Thanks for the useful insights. > > -Marco G. And for the next time: could you avoid top-posting please. Cheers, Azat. > > On Wed, Feb 11, 2015 at 3:11 PM, Azat Khuzhin <[email protected]> wrote: > > > On Tue, Feb 10, 2015 at 09:06:40PM -0800, Marco Graziano wrote: > > > I am running into a deadlock with both libevent 2.0 and 2.1 that I cannot > > > explain. One thread is executing a write callback and the other thread a > > > read callback of a different bufferevent. Since they are both accessing > > the > > > same data structure passed to the callbacks (arg=0x630c40), I am using a > > > rwlock in the data structure. I did not expect though the evbuffer > > internal > > > lock to interfere with the lock used to synchronize access to the data > > > structure shared between the callbacks. > > > > > > I am including the backtraces and the relevant portions of code. I would > > > appreciate if someone can give me any suggestion on how to work around > > this. > > > > > > Thanks, > > > > > > -Marco G. > > > > > > (gdb) t 1 > > > [Switching to thread 1 (Thread 0x7ffff7fdd780 (LWP 12008))] > > > (gdb) bt > > > #0 __lll_lock_wait () at > > > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 > > > #1 0x00007ffff7770672 in _L_lock_953 () from > > > /lib/x86_64-linux-gnu/libpthread.so.0 > > > #2 0x00007ffff77704da in __GI___pthread_mutex_lock (mutex=0x620200) at > > > ../nptl/pthread_mutex_lock.c:114 > > > #3 0x00007ffff7b9686e in evbuffer_get_length (buffer=0x630930) at > > > buffer.c:610 > > > > I don't sure that this two backtraces could have deadlock. Do you? > > Could it call write callback again and again, but you was so lucky that > > you always see such backtrace? > > > > Maybe some other thread acquire some locks in other order? > > (I bet to this one). > > > > Also did you check program under memcheck/drd/helgrind? > > Again if you was so *lucky* that you overwrite lock of evbuffer by your > > own, but it is so unlikely, that you could just ignore this. > > > > And if you could write 50-100L sample to reproduce the issue, it will be > > very helpful. > > > > Cheers, > > Azat. > > > > > #4 0x000000000040494b in device_write_cb (bev=0x123f640, arg=0x630c40) > > at > > > cometa.c:667 > > > #5 0x00007ffff7ba2139 in bufferevent_trigger_nolock_ (options=0, > > iotype=4, > > > bufev=0x123f640) at bufferevent-internal.h:369 > > > #6 bufferevent_writecb (fd=<optimized out>, event=<optimized out>, > > > arg=0x123f640) at bufferevent_sock.c:297 > > > #7 0x00007ffff7ba7a19 in event_persist_closure (ev=<optimized out>, > > > base=0x6218d0) at event.c:1531 > > > #8 event_process_active_single_queue (base=base@entry=0x6218d0, > > > activeq=0x621d20, max_to_process=max_to_process@entry=2147483647, > > > endtime=endtime@entry=0x0) at event.c:1590 > > > #9 0x00007ffff7ba82cf in event_process_active (base=0x6218d0) at > > > event.c:1689 > > > #10 event_base_loop (base=0x6218d0, flags=0) at event.c:1912 > > > #11 0x000000000040a700 in main () at cometa.c:1571 > > > > > > > > > bufferevent 0x123f640 write callback: > > > 665 pthread_rwlock_wrlock(&device->lock); > > > 666 if ((device->pending && device->state == WAITING_STATE) || > > > device->ws_pending) { > > > 667 int l = evbuffer_get_length(device_bev); How device_dev is installed? (evbuffer_new()/evbuffer_enable_locking()) > > > 668 pthread_rwlock_unlock(&device->lock); > > > 669 output = bufferevent_get_output(device->pending->bufev); > > > 670 if (evbuffer_get_length(output) == 0) { > > > 671 /* free and close the connection */ > > > 672 pthread_rwlock_wrlock(&device->lock); > > > 673 free_http_connection(device->pending); > > > 674 device->pending = NULL; > > > 675 device->state = IDLE_STATE; > > > 676 device->chunk_length = 0; > > > 677 device->ntoread = -1; > > > 678 > > > 679 evbuffer_drain (device_bev, l); > > > 680 sem_post(&device->active); > > > 681 pthread_rwlock_unlock(&device->lock); > > > 682 } > > > 683 } else { > > > 684 pthread_rwlock_unlock(&device->lock); > > > 685 } > > > > > > > > > (gdb) t 3 > > > [Switching to thread 3 (Thread 0x7ffff532f700 (LWP 12013))] > > > (gdb) bt > > > #0 pthread_rwlock_wrlock () at > > > ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:85 > > > #1 0x0000000000404667 in device_read_cb (bev=0x630730, arg=0x630c40) at > > > cometa.c:604 > > > #2 0x00007ffff7ba2324 in bufferevent_trigger_nolock_ (options=0, > > iotype=2, > > > bufev=0x630730) at bufferevent-internal.h:366 > > > #3 bufferevent_readcb (fd=34, event=<optimized out>, arg=0x630730) at > > > bufferevent_sock.c:187 > > > #4 0x00007ffff7ba7a19 in event_persist_closure (ev=<optimized out>, > > > base=0x7fffe8000900) at event.c:1531 > > > #5 event_process_active_single_queue (base=base@entry=0x7fffe8000900, > > > activeq=0x7fffe8000d50, > > > max_to_process=max_to_process@entry=2147483647, > > endtime=endtime@entry=0x0) > > > at event.c:1590 > > > #6 0x00007ffff7ba82cf in event_process_active (base=0x7fffe8000900) at > > > event.c:1689 > > > #7 event_base_loop (base=0x7fffe8000900, flags=0) at event.c:1912 > > > #8 0x000000000040a1ff in device_loop (arg=0x0) at cometa.c:1437 > > > #9 0x00007ffff776e182 in start_thread (arg=0x7ffff532f700) at > > > pthread_create.c:312 > > > #10 0x00007ffff689700d in clone () at > > > ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 > > > > > > bufferevent 0x630730 read call back: > > > 604 pthread_rwlock_wrlock(&device->lock); > > > 605 if (device->state != WAITING_STATE) { > > > 606 > > > 607 device->chunk_length = 0; > > > 608 device->ntoread = -1; > > > 609 /* in case of state == WAITING_STATE the counters reset is done > > in > > > the write callback after sending the response */ > > > 610 } > > > 611 pthread_rwlock_unlock(&device->lock); *********************************************************************** To unsubscribe, send an e-mail to [email protected] with unsubscribe libevent-users in the body.
