2009/1/5 Graham Dumpleton <[email protected]>:
> 2009/1/3 William Dode <[email protected]>:
>>
>> On 03-01-2009, Graham Dumpleton wrote:
>>>> I meant that my app is now thread safe, so I configured the daemon
>>>> directive to have 15 threads, and now I would like to know how many
>>>> simultaneous threads were actually used.
>>>
>>> Not following exactly what you mean. Are you wanting to know, when
>>> subjected to specific traffic, e.g. benchmarking, how many concurrent
>>> requests were handled at any one time, and thus how many of those 15
>>> threads were actually required in the worst case?
>>
>> Yes, exactly that.
>
> I'm not ignoring this, just been too busy to sit down and write some
> code to capture such information. I want to add the code to the
> debugging page on the wiki, as it could actually be useful in trying to
> work out what configuration one should use.
Out of curiosity, what WSGI application are you running?

The more I thought about how to write some generic code to capture
information about how many of the configured threads may be used at any
one time, the more I realised it really needs to take other things into
consideration as well, and so it is a non-trivial task. This was
reinforced for me by a recent blog post at:

http://collingrady.wordpress.com/2009/01/06/mod_python-versus-mod_wsgi/

by someone who has recently moved from mod_python to mod_wsgi.

The drops in memory usage and load there can be attributed to moving
from mod_python running in embedded mode to mod_wsgi running in daemon
mode. That is, fewer processes holding a full copy of the WSGI
application. When you look at the graphs for the period when mod_python
was running, the peaks in memory usage are where Apache decided it was
getting more requests than it could handle, and so created more child
worker processes to handle them. The act of creating the additional
processes and having to load the application into them afresh is also
why the load jumped up at the same time.

Because daemon mode in mod_wsgi uses a fixed number of processes, you
wouldn't get these spikes, but it also means you have to have enough
processes to begin with to handle the expected load. So, all in all,
nothing out of the ordinary here.

What surprised me, though, when I queried them about the exact daemon
mode configuration they were running, was that only a single daemon
process was used, and that it was running only a single thread. If this
were some small WSGI application it wouldn't be a big deal, but they
were running Django. They were also more than happy with the performance
they were seeing and how it was handling load, as it was much better
than what mod_python was doing before.
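For reference, a minimal sketch of what a multi-process, multi-threaded
daemon mode configuration looks like. The directives are real mod_wsgi
directives; the group name, paths and the exact process/thread counts
here are just placeholders:

```apache
# Fixed pool of 3 daemon processes, each running 15 request threads
# (illustrative values; tune to your application and memory budget).
WSGIDaemonProcess example.com processes=3 threads=15
WSGIProcessGroup example.com
WSGIScriptAlias / /path/to/app.wsgi
```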
Of note in the comments is where it says: """Well the busy times that
used to cripple the site with 10+ load and made it unreachable now only
resulted in a slightly slower page loading time - 3s by me just counting
it""".

Knowing this person's reputation in the Django community, what is
probably happening in practice is that they have optimised their
requests so well that even under load, with all requests being
serialised, the worst they are seeing is 3 seconds. The extra-long
response time here is not because the request is taking any longer in
the application, but because of the backlog of queued requests which
occurs when there is a burst of traffic.

After my quick discussion with him, he is actually going to try with 3
processes and see how it then handles the load. By running 3 processes
in the daemon process group, the amount of memory used by the WSGI
application will obviously be roughly 3 times as much, but one should
expect to see that request time come down from 3 seconds, as requests
will no longer be serialised.

Now, if the specific application he was running wasn't handling all
requests uniformly quickly, and some URLs took a lot longer than others,
having requests serialised would have been really bad. This is because
once a request comes in for a URL which is slow to return a response,
all the other requests start backing up even more quickly.

So, it isn't enough to just look at how many threads are utilised; you
really need a profile across your whole application of how long each URL
takes to respond. This is because how many threads and/or processes you
need is going to be dictated more by how many of the longer requests you
expect to get through in a certain time. I am presuming here that
someone has probably already written some benchmarking tool to work out
just that.
At worst, one could use siege to issue requests across a whole selection
of URLs for your site, and use a little WSGI middleware to try to
capture both thread utilisation and the length of requests. The only
problem is that you are going to end up with a lot of raw data, so one
needs a good way of summarising and presenting it. At this point I am
hoping that others will jump in with names of tools which might deal
with just this issue.

The moral of the story is that if you can properly optimise your actual
application so requests are always very quick, you can probably go quite
a long way with a very minimal number of processes (or even one) and a
few threads. So, perhaps don't overestimate what you need; better to
keep things small, especially if memory is an issue because you are
using a VPS, and concentrate on optimising the time taken for requests.

Graham
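To make the middleware idea concrete, here is a rough sketch of what
such a thing could look like. This is not code from mod_wsgi; the class
name and attributes are invented for illustration, and it only captures
peak concurrency and per-URL timings in memory:

```python
# Hypothetical WSGI middleware sketch: tracks how many requests are in
# flight at once (so you can see how many of your configured threads
# were actually needed) and how long each URL takes to respond.
import threading
import time

class UtilisationMiddleware:
    def __init__(self, application):
        self.application = application
        self.lock = threading.Lock()
        self.active = 0    # requests currently being handled
        self.peak = 0      # most requests ever seen in flight at once
        self.timings = {}  # PATH_INFO -> list of durations in seconds

    def __call__(self, environ, start_response):
        with self.lock:
            self.active += 1
            if self.active > self.peak:
                self.peak = self.active
        start = time.time()
        try:
            # Simplification: this times only until the application
            # returns its iterable, not until the response is fully
            # streamed out to the client.
            return self.application(environ, start_response)
        finally:
            duration = time.time() - start
            path = environ.get('PATH_INFO', '?')
            with self.lock:
                self.active -= 1
                self.timings.setdefault(path, []).append(duration)
```

You would wrap your existing WSGI entry point, e.g.
application = UtilisationMiddleware(application), and then periodically
log peak and a summary of timings to see how many threads were actually
used and which URLs are slow.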
