https://bugzilla.novell.com/show_bug.cgi?id=690474
https://bugzilla.novell.com/show_bug.cgi?id=690474#c0

Summary: mod_mono autorestart under load drops connections and may corrupt the command stream
Classification: Mono
Product: Mono: Runtime
Version: 2.10.x
Platform: x86-64
OS/Version: Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: misc
AssignedTo: mono-bugs@lists.ximian.com
ReportedBy: ben.l...@nearmap.com
QAContact: mono-bugs@lists.ximian.com
Found By: ---
Blocker: ---

Created an attachment (id=427000)
 --> (http://bugzilla.novell.com/attachment.cgi?id=427000)
Contains the aspx page and associated code

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.45 Safari/534.13

Ubuntu 9.04, Apache 2.2.11-2ubuntu2.3, mod_mono compiled from 2.10.1 source.

Running a site with the following mod_mono setup:

    MonoAutoRestartMode appsite Requests
    MonoAutoRestartRequests appsite 10000
    MonoMaxActiveRequests appsite 32
    MonoMaxWaitingRequests appsite 1024

The mod-mono executable is /usr/local/mono-2.10.1/bin/mod-mono-server4

Created a minimal site containing a single aspx page with code (see attachments). All the page does is return a fixed string.

Using an in-house load generator, we caused 128 workers to repeatedly load /alive.aspx as fast as possible (so there would be a maximum of 128 requests in parallel at any one time, and a maximum of 128 requests queued if the mod-mono-server is busy). The load generator typically runs at around 500 requests/second when exercising the mono back-end.

Every 10000 requests, the mod-mono-server is restarted.

Reproducible: Sometimes

Steps to Reproduce:
1. See details

Actual Results:
We see the following behaviour:

1. In the Apache error log:

    [Thu Apr 28 15:07:22 2011] [error] (70014)End of file found: read_data failed
    [Thu Apr 28 15:07:22 2011] [error] (70014)End of file found: read_data failed
    [Thu Apr 28 15:07:22 2011] [error] Command stream corrupted, last command was 1
    [Thu Apr 28 15:07:22 2011] [error] (70014)End of file found: read_data failed
    [Thu Apr 28 15:07:22 2011] [error] (70014)End of file found: read_data failed

   (many instances of this).

2. In the load-tester clients, at the point that the mod-mono-server is restarted, we see many connection refused errors and 500 server errors. We believe that these are due to the restart, since they happen simultaneously with the errors shown above.

3. The above does not always happen. For example, in one test run, when 10000 requests were reached, all requests waited until the restart and then continued; though the responses were delayed, no errors occurred. We therefore suspect this is a race condition that occurs when a certain pattern of parallel requests is handled.

Expected Results:
We would expect that when the mod-mono-server restarts, pending requests are queued until it is available again.

At present, we believe that the auto restart feature is necessary because (a) there are slow memory leaks in the mod-mono-server process (probably due to base heap fragmentation) and (b) there are a number of conditions which cause the mod-mono-server to consume 100% CPU*. However, unless the server can restart cleanly, it is not usable in production under load.

* If we can get a test case for any of these, we'll submit bug reports. The bug relating to caching, fixed in 2.10.2, is not the cause here, since we have worked around that (and also reported that bug).
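For anyone trying to reproduce this without our in-house load generator, the following is a rough sketch of an equivalent one in Python. It is not the tool we used. The host in URL is a placeholder (only the /alive.aspx path comes from this report), and the names fetch_status and run_load are made up for illustration. It starts a fixed number of workers that each issue requests back to back, and tallies status codes and connection-level failures so that 500s and refused connections around a restart show up in the totals.

```python
import collections
import concurrent.futures
import http.client
import urllib.error
import urllib.request

# Assumption: placeholder host; only the /alive.aspx path is from the report.
URL = "http://localhost/alive.aspx"


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code, or a short label for a connection-level failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            return resp.status
    except urllib.error.HTTPError as exc:
        # HTTP-level errors, e.g. 500 while the back-end is restarting.
        return exc.code
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # Connection-level errors, e.g. "ConnectionRefusedError".
        reason = getattr(exc, "reason", exc)
        return reason if isinstance(reason, str) else type(reason).__name__


def run_load(fetch, workers=128, requests_per_worker=1000):
    """Run `workers` threads, each calling fetch() back to back; return outcome counts."""

    def worker(_index):
        tally = collections.Counter()
        for _ in range(requests_per_worker):
            tally[fetch()] += 1
        return tally

    totals = collections.Counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        for tally in pool.map(worker, range(workers)):
            totals.update(tally)
    return totals
```

Calling run_load(lambda: fetch_status(URL)) with the defaults issues 128 x 1000 requests, which crosses the 10000-request restart threshold more than once. Any keys in the returned counter other than 200 are the failures to compare against the timestamps in the Apache error log. Passing the fetch function in as an argument keeps the worker loop independent of the HTTP client, so it can be swapped or stubbed.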