For a start, since you are using mod_wsgi daemon mode, ensure mod_headers is enabled in Apache and then add:
    RequestHeader add X-Queue-Start "%t"

That will give you the queuing time on the overview chart. I'll comment
more later when I have a chance to look over the email and data.

Graham

On 21 January 2012 09:07, Daniel Benamy <[email protected]> wrote:
> On Thu, Jan 19, 2012 at 8:55 PM, Graham Dumpleton
> <[email protected]> wrote:
>> It isn't out of the ordinary for response times to go up as the number
>> of concurrent requests is increased. It is the sum total of the
>
> Ok. It's good to know that the basic phenomenon is normal. Does it
> make sense that the hello world Django view would have response times
> that can be very closely modeled with this equation:
>
>     avg time = concurrent reqs x 1 ms + 1 ms
>
> Specifically, I would have thought that since I have 2 cores on this
> VM, 2 concurrent requests wouldn't go so much slower than 1 request at
> a time.
>
> I'm seeing scaling problems to a much greater degree on our production
> systems, but there it was, and may still be, because we're actually
> maxing out the CPU. To give a little more info about the production
> system: we've got Varnish sitting in front of the app servers and it's
> sometimes returning 503s. I'm trying to track down exactly what's
> going on. We had very high CPU usage on the app servers, which I
> improved by changing the app to make fewer and bigger db queries. We
> also limited the number of connections Varnish will make to an app
> server, which may have helped. Now CPU usage is usually OK, but
> responses will still occasionally take a long time.
>
> To be clear, I've got a few different environments/tests here:
> 1. the static, WSGI, Django benchmark [1],
> 2. the static file benchmark which includes throughput [2], and
> 3. our production systems.
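As a quick sanity check on that fitted model: combined with Little's law it
actually implies throughput keeps rising even as response times climb, which
is the earlier point about latency graphs on their own being deceiving. A
rough sketch (the 1 ms constants are just the fitted values from the equation
above, not anything measured here):

```python
def predicted_avg_ms(concurrent, per_request_ms=1.0, overhead_ms=1.0):
    # Dan's empirical fit: avg time = concurrent reqs x 1 ms + 1 ms.
    return concurrent * per_request_ms + overhead_ms

def throughput_rps(concurrent, avg_ms):
    # Little's law: requests/second = concurrency / average response time.
    return concurrent / (avg_ms / 1000.0)

for n in (1, 2, 4, 8):
    avg = predicted_avg_ms(n)
    print(n, avg, round(throughput_rps(n, avg), 1))
```

Under this model, going from 1 to 2 concurrent requests pushes average
latency from 2 ms to 3 ms, yet throughput still rises from 500 to roughly
667 requests/second, so the latency chart and the throughput chart tell
quite different stories.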
>
> [1] https://docs.google.com/spreadsheet/pub?hl=en_US&hl=en_US&key=0AurdDQB5QBe7dGFncnlUNkdKMVJ4NnYtRjhjaGFIeFE&output=html
> [2] http://serverfault.com/questions/344788/why-is-static-page-response-time-going-up-with-increased-concurrent-requests,
>     https://docs.google.com/spreadsheet/pub?hl=en_US&hl=en_US&key=0AurdDQB5QBe7dGtiLUc1SWdOeWQ4dGo3VDI5Yk8zbWc&output=html
>
> I thought I'd see if a minimal test case could pinpoint a bottleneck
> that might be contributing, but maybe I'm barking up the wrong tree
> and I should dig into the details of the production systems with New
> Relic instead of focusing on the minimal benchmarks.
>
>> What you aren't graphing is throughput, so relying on response times
>> alone is deceiving.
>
> That makes sense. I'll keep it in mind.
>
>> https://plus.google.com/114657481176404420131/posts/G1jM6WW3Pnu
>
> Cool post, thanks!
>
>> Whether one configuration is better than another will depend on your
>> specific application and whether its tasks are cpu intensive or i/o
>> bound.
>>
>> At that point for your application you look at changing processes vs
>> threads at the server level to make it work more efficiently, but
>> usually, more importantly, at tuning your application to cut down
>> application and database bottlenecks. Get rid of the worst and you
>> will bring down response times that way as well.
>
> In our production app, we seem to be using quite a bit of CPU and
> sometimes were (are?) bottlenecking on it. This would lead me to think
> that we'd want to have more mod_wsgi processes with fewer threads, but
> then we need more memory and we're already up against what our current
> VMs have. Does New Relic have a report on what's using memory by any
> chance? :-)
>
> Speaking of application and database bottlenecks (*hijacks discussion
> in a new direction* :-)), it seems that making db queries is using a
> lot of CPU!
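On db queries using a lot of CPU: much of the cost is often per-query
overhead (ORM work, cache-key computation in something like johnny-cache,
driver round trips) rather than the data itself, so collapsing an N+1 query
pattern into one join tends to save roughly in proportion to the query
count. A toy illustration using sqlite3 (the schema here is made up; in
Django terms the join version is roughly what select_related gives you):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 'X', 1), (2, 'Y', 2), (3, 'Z', 1);
""")

# N+1 pattern: one query for the books, then one more per book for its author.
books = conn.execute("SELECT title, author_id FROM book").fetchall()
authors = [conn.execute("SELECT name FROM author WHERE id = ?", (aid,)).fetchone()[0]
           for _, aid in books]
n_plus_one_queries = 1 + len(books)  # 4 round trips for 3 books

# Join pattern: the same data in a single query.
rows = conn.execute("""
    SELECT book.title, author.name
    FROM book JOIN author ON author.id = book.author_id
""").fetchall()
join_queries = 1
```

The per-query overhead is paid four times in the first version and once in
the second, even though both end up with the same data.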
> I drastically improved our CPU usage by making our app do more joins
> so it could get data from the db in fewer queries. I've got
> johnny-cache set up, so queries involve some extra work, but I
> wouldn't expect them to use so much CPU when the memcached servers are
> on separate machines. Does anything come to mind about why this would
> be happening?
>
>> In those graphs you will see a few things worth exploring to help
>> tune things. These are thread utilisation and queuing time.
>
> Very cool. On the production site, since we've got Varnish sitting in
> front of the app servers and limiting itself to 10 connections per app
> server, we definitely should have enough Apache workers. It is
> possible that we're running out of mod_wsgi daemon workers under
> certain circumstances (we have many sites in separate mod_wsgi daemon
> process groups, with 5 threads each).
>
>> If you are happy to try mod_wsgi 4.0, you can get access to both
>> queuing time and also thread utilisation. It will all change at some
>> point, but the current New Relic Python agent is able to grab the
>> data and you can graph it with custom views to get those charts I
>> link to.
>
> Interesting. I'd like to pursue the performance problems a little
> further with mod_wsgi 3. If I don't get anywhere, I'll speak with my
> team about trying mod_wsgi 4.
>
>> As to your New Relic account, I can't seem to find it by searching
>> for email or name. You might let me know the account number that
>> shows in the URL so I can look at what data you are getting and, if
>> you try mod_wsgi 4.0, I can give you the custom view definition you
>> can set up to get that chart.
>
> The account number is 75245. It has an application for our production
> setup. Thanks for taking a look!
>
>> BTW, don't entirely focus on the application side either.
>> Although you may get your application performance times down to 100 ms
>> levels, users will not care much if they are still seeing 6 second
>> page load times because of page rendering. Better user satisfaction
>> can therefore often be achieved more quickly by improving the
>> HTML/JavaScript sent back to the browser.
>
> That makes a lot of sense.
>
> Thank you so much!
> Dan
>
> --
> You received this message because you are subscribed to the Google Groups
> "modwsgi" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/modwsgi?hl=en.
