Johan,

On Sat, Apr 14, 2012 at 11:42 PM, Johan Corveleyn <jcor...@gmail.com> wrote:
> On Wed, Apr 11, 2012 at 1:43 PM, Johan Corveleyn <jcor...@gmail.com>
> wrote:
> > On Wed, Apr 11, 2012 at 12:28 PM, Philip Martin
> > <philip.mar...@wandisco.com> wrote:
> >> Johan Corveleyn <jcor...@gmail.com> writes:
> >>
> >>> I don't know what Surf-Shield does. Its description says: "Can detect
> >>> exploit sites and other complex online threats". There is some more
> >>> explanation on the AVG website, but it's still pretty vague [2]. Maybe
> >>> it does some throttling of requests/responses, inspecting things or
> >>> so, ... but whatever it does, svn+serf should probably not crash or
> >>> hang.
> >>
> >> You could compare the apache logs with/without Surf-Shield.
> >> You could capture the traffic with/without Surf-Shield and compare.
>
> Ok, I picked the first failing test, authz_tests.py#4, and executed
> that with and without Surf-Shield. Please find in attachment two zip
> files of those two runs, containing Apache logs and a wire capture, as
> well as the crash dump file.
>
> I don't see a difference in the Apache logs (they are identical,
> except that the crashing one stops earlier). The wire capture ... I'm
> not sure. The one from the crash is obviously smaller. But when I
> "follow TCP stream" they both seem identical (same number of bytes and
> all), and when I then filter out the followed stream, nothing remains.
> So I'm not sure where the difference is ...
>
> I'm hoping someone can take it from here. I'm not familiar with this
> part of the code. Maybe the best place to start digging is the crash
> dump (and/or a more thorough analysis of both wire captures). If I
> hear nothing in the next couple of days, I'll put this into the issue
> tracker so it isn't forgotten.
>

The dump and logs don't show the issue; only debugging helps in this case.
The problem is in ra_serf/util.c, svn_ra_serf__handle_xml_parser:

  if (sl.code == 404 && ctx->ignore_errors == FALSE)
    {
      add_done_item(ctx);

      err = svn_ra_serf__handle_server_error(request, response, pool);

      SVN_ERR(svn_error_compose_create(
        svn_ra_serf__handle_discard_body(request, response, NULL, pool),
        err));

When the response status of a PROPFIND request is 404, the response
body is discarded with calls to svn_ra_serf__handle_server_error and
svn_ra_serf__handle_discard_body. In your particular scenario, the
status line of the response has already been received, but the body
has not. Reading from the response buckets returns an EAGAIN status.

Problem: the add_done_item(ctx) line ensures that the request is
considered handled while the response body is still waiting on the
socket to be read. ra_serf will only run the serf loop again with the
next request. If the connection is not closed directly, which it isn't
here, the next request will get a response that doesn't match.

The fix is to ensure that the request is only marked as handled when
a. the response body has been discarded completely, or
b. a read error was encountered, resulting in serf setting up a new
   connection.

I don't have a tested solution, as my Windows VM was so nice to reboot
to install some updates while I was in the middle of a debug session,
and I don't have time now to start over.

hth,

Lieven
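
For illustration only, the rule described above (mark the request as
handled only once the body is fully discarded or a read error occurs)
can be modelled in a few lines of standalone C. This is a toy sketch,
not a patch against ra_serf: every name in it (mock_response_t,
mock_ctx_t, mock_read, discard_404_body, the READ_* statuses) is
hypothetical and merely stands in for the serf buckets, the parser
context and add_done_item(ctx).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the statuses a bucket read can return. */
typedef enum { READ_OK, READ_EAGAIN, READ_EOF, READ_ERROR } read_status_t;

/* Toy response body: 'total' bytes overall, of which 'arrived' bytes
   have reached the socket and 'consumed' bytes have been read. */
typedef struct {
    size_t total;
    size_t arrived;
    size_t consumed;
    bool broken;      /* simulates a read error on the connection */
} mock_response_t;

/* Toy parser context: 'done' plays the role of add_done_item(ctx). */
typedef struct {
    bool done;
} mock_ctx_t;

/* Read up to 'max' bytes of the body that have already arrived. */
read_status_t mock_read(mock_response_t *r, size_t max)
{
    assert(r != NULL);
    if (r->broken)
        return READ_ERROR;
    if (r->consumed == r->total)
        return READ_EOF;
    if (r->consumed == r->arrived)
        return READ_EAGAIN;           /* rest of the body not here yet */

    size_t avail = r->arrived - r->consumed;
    r->consumed += (avail < max) ? avail : max;
    return (r->consumed == r->total) ? READ_EOF : READ_OK;
}

/* Discard the body of a 404 response. The request is marked as done
   only at EOF or on a read error; on EAGAIN it stays pending so the
   caller runs its loop again for this same request. */
read_status_t discard_404_body(mock_ctx_t *ctx, mock_response_t *r)
{
    for (;;) {
        read_status_t s = mock_read(r, 8);
        if (s == READ_OK)
            continue;                 /* more data available, keep going */
        if (s == READ_EOF || s == READ_ERROR)
            ctx->done = true;
        return s;
    }
}
```

In this model, calling discard_404_body() before the body has arrived
returns READ_EAGAIN and leaves the request pending, so the leftover
body cannot be mistaken for the response to the next request. Calling
it again after the data arrives drains the body and only then sets the
done flag.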