On Tue, Feb 19, 2013 at 11:49 AM, Nick Mathewson <ni...@freehaven.net> wrote: > Could you be more specific about the good results, and how to > reproduce them? Or how to reproduce/observe the undesired behavior? > It would also be good to have an explanation of how the patch is > supposed to work. I had a lot of trouble reproducing this bug. It is exacerbated by high requests parallelism, where there is a high chance of having fds numbers get reused by the kernel and the evmap falls out of sync with the kernel's idea of which fds are and aren't used.
I think the trigger is a connection that times out. Then evhttp_connection_reset_ closes the fd (evutil_closesocket) but the bufferevent still thinks it owns the fd and the fd is still present in the underlying evmap/epoll implementation. Then the kernel hands the same fd to a new bufferevent and that bufferevent starts getting events for the old one. This would cause assertions in the epoll layer like "Epoll ADD(X) on fd Y failed", Valgrind invalid reads or invalid writes and occasionally crashes. I'm sorry I can't be more specific than this. The way I tackled this was by adding a ton of cross-layer assertions like ("fd returned from accept is not present in evmap") or "fd passed to evutil_closesocket must have no remaining event interests attached to it" and gradually tracing back to who was violating these assumptions. We saw a 7-fold reduction in crash rate with this patch. *********************************************************************** To unsubscribe, send an e-mail to majord...@freehaven.net with unsubscribe libevent-users in the body.