2008/11/29 William Dode <[EMAIL PROTECTED]>:
>
> On 28-11-2008, Graham Dumpleton wrote:
>>
>> 2008/11/29 William Dode <[EMAIL PROTECTED]>:
>>>
>>> Hi,
>>>
>>> I have a little virtual server with a few websites with very low
>>> traffic and one website with more traffic (100000 hits/day). Most of
>>> the time it works very, very well, very fast, without using too much
>>> memory...
>>>
>>> But sometimes the load goes very high and I don't know why; I imagine
>>> it's because of some web spiders, an attack, or a bug in my application.
>>>
>>> I would like some tips on configuring Apache so it can handle this problem.
>>>
>>> I use apache-worker and mod_wsgi in daemon mode.
>>>
>>> For the mod_wsgi app I use threads=1 (my app is not thread safe) and
>>> maximum-requests=1000 (for memory leaks).
>>
>> Can you post all the WSGI related directives you have in your Apache
>> configuration so I can see exactly what you have and whether it is
>> correct?
>
> WSGIDaemonProcess seps.flibuste.net user=seps group=www-user stack-size=524288 python-path=/home/web/seps/pynclude:/home/web/seps/seps.flibuste.net/pynclude home=/home/web/seps threads=1 inactivity-timeout=600 maximum-requests=1000 display-name=wsgi-seps.flibuste.net
> WSGIProcessGroup seps.flibuste.net
> AddHandler wsgi-script .wsgi
>
>>
>>> For Apache I did this: BUT I DON'T THINK IT'S CORRECT
>>
>> For your size of site (approx 1-2 requests/sec average) you could have
>> just left the default Apache worker configuration as is and the site
>> would probably have worked fine. I wish people would stop trying to
>> prematurely optimise performance when it isn't necessary, especially
>> when they don't really understand Apache configuration and are likely
>> to do more harm than good. :-(
>
> I would not have touched the configuration if I had not had the problem
> in the first place, of course!
>
>>
>> The thing that is more likely to kill you is your application not
>> being thread safe, although that depends on how you have set up the
>> WSGI directives, which you didn't supply. So, post the WSGI bits of
>> the configuration and I will then be able to comment further.
>
>
>>
>> Also indicate what the WSGI application is written in, i.e., which framework.
>
> My own framework, very light and fast; I now use WebOb and PostgreSQL
> with no ORM, and I cache a lot...
> This application was working fine for years; the problem came after an
> upgrade to lenny (apache2 + mod_wsgi), before which I used
> apache1 + mod_proxy + twistedmatrix. I measure each page on the app side
> and the average is 0.04s, which is why I didn't bother to make it thread safe.
What proportion of your overall requests are handled dynamically by
your WSGI applications as opposed to how many are static file
requests?
The problem is that if a large proportion of them are dynamic requests,
or the URLs are such that search engines or spam bots might readily
trigger them, then because you have a single daemon process with only a
single thread, it can only handle one request at a time. Even if it
was just a surge in traffic, you should have been able to tell that from
the Apache access logs, unless you have disabled logging.
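As a rough check, something like the following should show which minutes
had the most hits (just a sketch: it assumes the default common/combined
log format and a log file at /var/log/apache2/access.log, so adjust the
path for your setup):

  awk '{print $4}' /var/log/apache2/access.log | cut -c 2-18 | sort | uniq -c | sort -rn | head

A sudden spike concentrated on one URL, or coming from obviously bot-like
clients, would point at spiders or an attack rather than your application.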
Anyway, it doesn't matter how many processes/threads you configure for
the main Apache child worker processes; any dynamic request will stall if
a dynamic request before it takes a long time. In other words, you
have created a potential bottleneck in the way you have configured the
daemon process group.
At the minimum, you should create multiple daemon processes to handle
requests if they are going to be single threaded. Thus:
WSGIDaemonProcess seps.flibuste.net ... processes=5 threads=1 ...
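For example, adapting the directive you posted above (a sketch only; keep
whatever other options you already have):

  WSGIDaemonProcess seps.flibuste.net user=seps group=www-user \
    processes=5 threads=1 maximum-requests=1000 inactivity-timeout=600 \
    stack-size=524288 home=/home/web/seps \
    python-path=/home/web/seps/pynclude:/home/web/seps/seps.flibuste.net/pynclude \
    display-name=wsgi-seps.flibuste.net

Each of the five processes is still restarted after 1000 requests, so any
memory leaks stay contained; the tradeoff is roughly five times the base
memory use of a single process.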
A further issue is, since you are using AddHandler instead of
WSGIScriptAlias, how many different .wsgi scripts do you have? If you
have numerous scripts, you need to be aware that each will be run in
the context of its own sub interpreter within the daemon process. The
sub interpreters persist between requests for the same WSGI script,
but by having a sub interpreter per WSGI script in this case, if you
have a lot of scripts, you could be using up a lot more memory than
you need. And if the startup costs are high because of what code has
to be loaded, you will also cause yourself slowdowns while processes
handle their initial requests after a process restart, due to having
to load the same code modules multiple times.
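For comparison, if you really only have one application, mounting it
explicitly rather than via AddHandler would look something like this
(the script path here is purely hypothetical, substitute your own):

  WSGIScriptAlias / /home/web/seps/app.wsgi

That isn't required though; the simpler change in your situation is the
one below.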
If you have multiple WSGI scripts in your sort of situation and they
can actually coexist in the same single sub interpreter, you would be
better off running them in one interpreter rather than many. To force
them all to run in the same interpreter, set:
WSGIApplicationGroup %{GLOBAL}
In this case they will actually be running in the main interpreter.
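In your configuration that would sit alongside the directives you already
have, something like (a sketch based on what you posted):

  WSGIProcessGroup seps.flibuste.net
  WSGIApplicationGroup %{GLOBAL}
  AddHandler wsgi-script .wsgi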
As to the load of the box going up, did you actually identify using 'ps'
whether it was the Apache child worker processes or the mod_wsgi daemon
processes that were consuming the CPU? If it was a mod_wsgi daemon
process, then it is likely your application code itself. As to the Apache
child worker processes having a high load, that wouldn't be something I
would expect unless you were running other non-static handlers in Apache
for handling requests.
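Because you have set display-name for the daemon process group, it is
easy to pick those processes out. Something like this should work (a
sketch for Linux; exact ps options vary by platform):

  ps -eo pid,pcpu,pmem,args | grep wsgi-seps.flibuste.net

If those entries show high %CPU during an episode, the problem is inside
your application; if instead the apache2 worker processes dominate, look
at Apache itself.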
The only real situation I could see where something related to
mod_wsgi could be the cause of the load suddenly going up high is if your
application ran code which caused the process to crash. This would
cause mod_wsgi to immediately start another process in its place. If
something was continually firing requests at the same URL in very rapid
fashion, and it crashed every time, you might see some load increase,
but even then I wouldn't expect it to be that noticeable unless the
startup cost of your application is very high.
The only time I would expect it to be really noticeable is if you were
preloading code into the mod_wsgi daemon processes and they were crashing
at startup, in which case it would go into a tight loop restarting the
process until Apache was shut down. This would be more problematic because
it is not dependent on a subsequent request coming in to load the code again.
If processes were crashing and causing either of these scenarios,
there should be some information in the Apache error logs about it. To
ensure there is, you should use:
LogLevel debug
in the main Apache configuration, and if necessary in any VirtualHost
where LogLevel may also be set.
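Concretely, that might look like the following (a sketch; the ServerName
is assumed from your daemon process group name and your actual
VirtualHost will contain other directives):

  # main server config
  LogLevel debug

  <VirtualHost *:80>
      ServerName seps.flibuste.net
      # repeat it here if this vhost sets its own LogLevel
      LogLevel debug
  </VirtualHost>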
Overall, I can't see a lot in the information you have provided on
which to base any decision to change the Apache configuration. It seems
you are more just trying things rather than actually knowing what the
true trigger of the problems may have been.
I wouldn't see something like KeepAlive making much difference if you are
using the worker MPM, because even if all the sockets were held, for the
worker MPM the maximum number of Apache child worker processes is usually
small. This is in contrast to prefork, where it can be 100. In the
case of prefork, if you were running out of sockets and having to create
lots of Apache child worker processes, it would still only really be an
issue if those Apache child worker processes had to perform something
significant at startup or on their first request. For example, if your
WSGI applications weren't actually running in a mod_wsgi daemon process
but in embedded mode, every new Apache child worker process would also
have to load up your application when a request hitting that new process
targeted it.
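If you want to rule embedded mode out entirely, and your mod_wsgi version
supports it, you can optionally add the following to the main
configuration so that mod_wsgi refuses to run WSGI applications in the
Apache child processes (only do this if everything is meant to run in
daemon mode):

  WSGIRestrictEmbedded On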
If you have some better information on the actual cause, from the
access/error logs etc., or from what 'ps' was showing, it would help as
far as working out the real problem and thus what configuration changes
you might need, but otherwise I can only suggest the above as far as
mod_wsgi goes and how to at least improve performance on that side of
things.
Graham