2009/1/5 Graham Dumpleton <[email protected]>:
> 2009/1/3 William Dode <[email protected]>:
>>
>> On 03-01-2009, Graham Dumpleton wrote:
>>>> I meant that my app is now thread safe, so I configured the daemon
>>>> directive to have 15 threads and now I would like to know how many
>>>> simultaneous threads were actually used.
>>>
>>> Not following exactly what you mean. Are you wanting to know,
>>> when subjected to specific traffic, eg. benchmarking, how many
>>> concurrent requests were handled at any one time, and thus how many
>>> of those 15 threads were actually required in the worst case?
>>
>> Yes, exactly that.
>
> I'm not ignoring this, just been too busy to sit down and write some
> code to capture such information. I want to add the code to the
> debugging page on the wiki as it could actually be useful in trying
> to work out what configuration one should use.

Out of curiosity, what WSGI application are you running?

The more I thought about how to go about writing some generic code to
capture information about how many of the configured threads may be
in use at any one time, the more I realised it really needs to take
other factors into consideration as well, and so it is a non-trivial
task.
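
To give an idea of the bare-bones part of that, here is an untested
sketch of a WSGI middleware which records a high-water mark for the
number of concurrent requests. The class name and reporting are made
up for illustration, and it deliberately ignores all the other
factors:

    import threading

    class ConcurrencyMonitor(object):
        # Count requests currently in flight and remember the peak,
        # reporting each new peak to the Apache error log.
        def __init__(self, application):
            self.application = application
            self.lock = threading.Lock()
            self.active = 0
            self.peak = 0

        def __call__(self, environ, start_response):
            self.lock.acquire()
            try:
                self.active += 1
                if self.active > self.peak:
                    self.peak = self.active
                    environ['wsgi.errors'].write(
                            'peak concurrent requests = %d\n' % self.peak)
            finally:
                self.lock.release()
            try:
                for data in self.application(environ, start_response):
                    yield data
            finally:
                self.lock.acquire()
                self.active -= 1
                self.lock.release()

You would just wrap the WSGI application entry point, eg.
'application = ConcurrencyMonitor(application)', and watch the error
log while throwing traffic at it.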

That it needs to consider more than just thread counts was reinforced
for me by a recent blog post at:

  http://collingrady.wordpress.com/2009/01/06/mod_python-versus-mod_wsgi/

by someone who has recently moved from mod_python to mod_wsgi.

The drops in memory and load here can be attributed to moving from
mod_python running in embedded mode to using mod_wsgi in daemon mode.
That is, fewer processes each holding a full copy of the WSGI
application. When you look at the graphs for when mod_python was
running, the peaks in memory usage are when Apache decided it was
getting more requests than it could handle and so created more child
worker processes to handle them. The act of creating the additional
processes and having to load the application afresh in each is also
why the load jumped up at the same time. Because daemon mode in
mod_wsgi uses a fixed number of processes, you wouldn't get these
spikes, but it does also mean you have to have enough processes to
begin with to handle the expected load. So, all in all, nothing out
of the ordinary here.

What surprised me though, when I queried them as to the exact daemon
mode configuration they were running, was that only a single daemon
process was used, and that it was running only a single thread. If
this was some small WSGI application then it wouldn't be a big deal,
but they were running Django. They were also more than happy with the
performance they were seeing and how it was handling load, as it was
much better than what mod_python had been doing before.
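
For reference, the daemon mode setup they described would correspond
to something like the following, where the process group name and
script path are of course made up:

    WSGIDaemonProcess mysite processes=1 threads=1
    WSGIProcessGroup mysite
    WSGIScriptAlias / /usr/local/www/mysite/django.wsgi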

Of note in the comments is where it says:

"""Well the busy times that used to cripple the site with 10+ load and
made it unreachable now only resulted in a slightly slower page
loading time - 3s by me just counting it""".

Knowing this person's reputation in the Django community, what in
practice is probably happening is that they have optimised their
requests so well that even when they are under load, and with all
requests being serialised, the worst they are seeing is 3 seconds.
The extra-long response time here is not because the request is
taking any longer in the application, but because of the backlog of
queued requests which occurs when there is a burst of traffic.

After my quick discussion with him, he is actually going to try with 3
processes and see how it then handles the load.

By running 3 processes in the daemon process group, obviously the
amount of memory used by the WSGI application will be roughly 3 times
as much, but one should expect to see that request time come down from
3 seconds as requests will no longer be serialised.
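
In terms of the made-up configuration shown earlier, that change is
just:

    WSGIDaemonProcess mysite processes=3 threads=1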

Now, if the specific application he was running wasn't uniformly
handling all requests very quickly and some URLs took a lot longer
than others, having requests be serialised would have been really bad.
This is because once a request comes in for a URL which is slow to
return a response, all the other requests start backing up even
quicker.

So, it isn't enough to just look at how many threads are utilised;
you really need to also get a profile across your whole application
of how long each URL takes to respond. This is because the number of
threads and/or processes you need is dictated more by how many of the
longer requests you expect to have to get through in a certain amount
of time.
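
As a rough rule of thumb, the average number of requests in progress
at any one time is the request arrival rate multiplied by the average
response time (Little's law). The figures below are invented purely
for illustration:

    # Rough capacity estimate: average concurrency is arrival rate
    # times average response time. All numbers are made up.
    requests_per_second = 10.0
    average_response_time = 0.25   # seconds, averaged over all URLs
    concurrency = requests_per_second * average_response_time
    print(concurrency)   # 2.5, so budget at least 3 threads/processes

A handful of slow URLs drags that average up very quickly, which is
why it is the longer requests which end up dictating the
configuration.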

Now, I am presuming here that someone has probably already written
some benchmarking tool to work out just that. At worst one could use
siege to issue requests across a whole selection of URLs for your
site and use a little WSGI middleware to try and capture both thread
utilisation and the length of requests. The only problem with that is
you are going to end up with a lot of raw data, so one needs a good
way of summarising and presenting it.
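
An untested sketch of the capture half, along the lines of the
monitor above but recording how long each URL takes; again the class
name and log format are made up, and summarising the output is left
as an exercise:

    import time

    class RequestTimer(object):
        # Write one raw data line per request, recording the URL
        # and the elapsed time, to the Apache error log.
        def __init__(self, application):
            self.application = application

        def __call__(self, environ, start_response):
            start = time.time()
            try:
                for data in self.application(environ, start_response):
                    yield data
            finally:
                duration = time.time() - start
                environ['wsgi.errors'].write('%s %.3f\n' % (
                        environ.get('PATH_INFO', '/'), duration))

Wrapping the application with both this and the concurrency monitor,
then replaying a selection of URLs with siege, would give you the raw
numbers to work from.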

At this point I am hoping that others will jump in with names of tools
which might deal with just this issue.

The moral of the story is that if you can properly optimise your
actual application so requests are always very quick, you can
probably go quite a long way with a very minimal number of processes
(or even just one) and a few threads. So, perhaps don't overestimate
what you need; better to keep things small, especially if memory is
an issue because you are using a VPS, and concentrate on optimising
the time taken for requests.

Graham
