On Nov 16, 5:15 pm, Graham Dumpleton <[email protected]> wrote:
> > On the system that was exhibiting the problem, I moved the daemon
> > count from 4 to 8, and threads from 5 to 8, and bumped up the max
> > requests. This appears to have mitigated the problem, but as it
> > appears for now to be a bug either within mod_wsgi or apache I'm happy
> > to help any way I can in tracking it down.
>
> Are you in a position to be able to upgrade versions of Apache and mod_wsgi?
>
> Are you willing to run mod_wsgi from subversion trunk? The subversion
> version has additional logging to help gather more information about
> this issue.

Upgrading mod_wsgi I can do. However, since I made the aforementioned
changes the web server cluster has run with 100% uptime for 5 days,
when previously we averaged one failure per day (across a 3-machine
farm). Whatever the problem is, having more daemon processes per group
seems to mitigate it.

> Can you try and get stack traces of stuck daemon process group using
> gdb script recipe right at end of:
>
> http://code.google.com/p/modwsgi/wiki/DebuggingTechniques
>
> Please also post whether using prefork Apache MPM, plus mod_wsgi
> daemon process and related directives from mod_wsgi configuration.

We use prefork (haven't seen the need to switch) and run 43 vhosts,
each of which has the following configuration:

    WSGIDaemonProcess <name> user=apache group=apache display-name=<name> processes=4 threads=5 maximum-requests=400

The vhosts then scriptalias the vhost root to a WSGI handler.

We isolated the failures to one specific vhost which has higher traffic
than the others, and it has been changed to 8/8/1024
(processes/threads/maximum-requests) instead of 4/5/400.

> I worked with that OP a fair bit off list to help sort this out, but
> put it aside at the moment as busy in last few weeks of a job before I
> leave. They weren't able to move to latest mod_wsgi source so can get
> debug output from extra code I added.

We have our web servers behind a layer 7 switch, and we've already
built the latest mod_wsgi.
I'll keep in touch about the failures, and once it trips again I'll
upgrade mod_wsgi to get you the logging you need.

A stack trace, on the other hand, is harder. I'd really need to bring
an additional machine into the farm for that, as the chance of a node
failure would seem to be larger. I can probably spare the staff
resources to make that happen over the Christmas period should this
issue remain unresolved by then.

Cheers,
Matt
