On Mar 17, 2010, at 10:44 PM, Graham Dumpleton wrote:
>
> If all the prefork processes are doing is serving static files and
> proxying to daemon mode, this configuration looks way over the top to
> me.
>
> What is your justification for setting MaxClients to such a high value?
>
Mainly to avoid dropping connections under high load - let's just say for now
that we overbought on memory and can afford to use it for stuff like this...
but see my next question.
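For context, the relevant bits of our prefork config look roughly like this
(the numbers here are illustrative, not our exact production values):

```apache
# Illustrative prefork MPM settings - placeholder values, not our
# exact production config.
<IfModule mpm_prefork_module>
    StartServers          10
    MinSpareServers       10
    MaxSpareServers       50
    ServerLimit          600
    MaxClients           600
    MaxRequestsPerChild 1000
</IfModule>
```

One gotcha we hit: ServerLimit has to be raised alongside MaxClients once you
go past the compiled-in default cap of 256 processes.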
> For a properly tuned web application, request response times should be
> well under a second. A high value for MaxClients would only be
> justified if you have lots of long running requests or if keep alive
> connections become an issue.
So that's exactly our issue - we have a RESTful API that can take longer - in
fact some of our requests take well over 30 seconds. At the moment the
processes serving these are pretty compute-intensive - CPU-bound rather than
the standard mostly-IO-bound profile. This means we can potentially have a
number of CPU-bound daemons, and under load it seems to make sense to let
Apache queue up requests. 600 is high, I agree, but it all kind of started
with "let's try 200 and bump up the load - ooh, that worked - let's try 400
with a higher load", etc.
Our website is a totally different story: the HTTP requests are generally
pretty fast - and in fact this is why we have multiple mod_wsgi daemon groups -
the website tends to have the patterns you talk about.
> For keep alive connections if they become
> an issue, you would be better off turning keep alive off or putting
> nginx proxy in front. The nginx proxy would also serve to isolate you
> from slow clients as well, further removing the need for a high value
> of MaxClients. Default Apache installations would normally have a
> MaxClients of no more than 150, and that would handle the majority of
> web sites.
Yeah, we have a Squid in front of this, but Squid kind of sucks at dealing
with slow clients or buffering up connection requests. But your advice is good -
we shouldn't be relying on each application server to buffer up connections;
that should be happening out in front. We're likely going to switch to
Varnish, but probably not immediately... this is fuel for that fire for us :)
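If we went the nginx route you suggest instead, I gather a minimal buffering
front-end would look something like this (addresses and ports are made up):

```nginx
# Hypothetical nginx front-end; the backend address is made up.
events {}

http {
    upstream apache_backend {
        server 127.0.0.1:8080;
    }

    server {
        listen 80;

        location / {
            # Buffer requests/responses so slow clients don't tie up
            # a heavyweight Apache prefork process for the duration.
            proxy_buffering on;
            proxy_pass http://apache_backend;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
```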
>
> Have you ever used mod_status or other monitoring to find out what the
> maximum number of prefork processes are being created? Are you really
> getting hundreds of processes to justify that excessive value?
>
During load spikes, yes. Again, this mostly has to do with spikes in our API
traffic.
> Tell me again how many daemon process groups you are creating and how
> many processes/threads are allocated across them.
Four at the moment, but one process group (the API service with the high CPU
load - the one we're having the most problems with) shoulders 60-70% of the
load. All of them have threads=1; two have processes=10, and the two
high-traffic ones have processes=24.
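Concretely, the daemon group setup is along these lines (group names and
paths are illustrative, not our real ones):

```apache
# Four daemon groups, threads=1 everywhere; group names and
# filesystem paths here are made up for illustration.
WSGIDaemonProcess api     processes=24 threads=1
WSGIDaemonProcess site    processes=24 threads=1
WSGIDaemonProcess admin   processes=10 threads=1
WSGIDaemonProcess reports processes=10 threads=1

WSGIScriptAlias /api /srv/api/app.wsgi
<Location /api>
    WSGIProcessGroup api
</Location>
```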
>
> Application groups refer to the Python sub interpreter within a
> process. A value of %{GLOBAL} refers to the main or first interpreter
> created. It has special attributes. If you are only delegating one
> WSGI application to a daemon process, it is safe to force use of main
> interpreter in all daemon processes.
>
Ooh, interesting. So what we actually do is build separate virtualenvs (all
originally built with the same Python binary, don't worry), each with its own
set of site-packages + PYTHONPATH. We want clear sandboxing between the
different apps, so that one application might have FormEncode 1.0 installed in
its site-packages and another might have FormEncode 1.1.
Should we be designating a specific WSGIApplicationGroup then for each
application?
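i.e., since each daemon group hosts exactly one application, is this the safe
pattern? (Paths and names are made up here.)

```apache
# One virtualenv per daemon process group; paths are hypothetical.
WSGIDaemonProcess api processes=24 threads=1 \
    python-path=/srv/venvs/api/lib/python2.6/site-packages

WSGIScriptAlias /api /srv/api/app.wsgi
<Location /api>
    WSGIProcessGroup api
    # Force the main interpreter, since only one WSGI application
    # is delegated to this daemon process group.
    WSGIApplicationGroup %{GLOBAL}
</Location>
```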
Alec
--
You received this message because you are subscribed to the Google Groups
"modwsgi" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/modwsgi?hl=en.