On Jan 11, 2009, at 1:38 PM, Tycon wrote:

well I'm a performance person and I hate wasting time because of bad/
sloppy design and implementation. And even for low volume apps, a well
tuned app will have a better response time, as well as higher load
capacity and better scalability.

Mmmm, performance, I like that. :)

In pure requests per second, I made a bunch of tune-ups that resulted in
these improvements:
1. Use python2.6 ==> +5%
2. Use memcached to store sessions ==> +10%

You'll find that pure cookie based sessions are even faster.
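For reference, switching the session backend is just an ini change in a Pylons app. This is a sketch; the option names below are the Beaker ones as I remember them, so double-check them against the Beaker docs for your version:

```ini
# development.ini -- memcached-backed sessions
beaker.session.type = ext:memcached
beaker.session.url = 127.0.0.1:11211

# ...or pure cookie-based sessions (no server-side store at all);
# cookie sessions need a signing secret:
#beaker.session.type = cookie
#beaker.session.validate_key = some-random-secret
```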

3. Make sure the session is loaded and saved only once instead of
multiple times during a request ==> 5-10%

Or even more depending on use of save(). I'll get out a new Beaker soon with the improvements you helped with, thanks!
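To illustrate the "load once, save once" pattern outside of Beaker: the `LazySession` class below is a hypothetical stand-in (not Beaker's actual API) that uses a plain dict as the backend store. The point is that reads and writes during the request touch the backend at most once each, no matter how many times the controller pokes at the session:

```python
class LazySession:
    """Load the session on first access, write it back at most once."""

    def __init__(self, store, session_id):
        self._store = store       # dict standing in for memcached etc.
        self._id = session_id
        self._data = None         # not loaded yet
        self._dirty = False

    def _load(self):
        if self._data is None:
            # One fetch from the backend per request, however many
            # times the controller reads or writes the session.
            self._data = dict(self._store.get(self._id, {}))
        return self._data

    def get(self, key, default=None):
        return self._load().get(key, default)

    def __setitem__(self, key, value):
        self._load()[key] = value
        self._dirty = True

    def persist(self):
        # Called once at the end of the request; no-op if untouched.
        if self._dirty:
            self._store[self._id] = self._data
            self._dirty = False


backend = {}
session = LazySession(backend, "abc123")
session["user"] = "tycon"
session["visits"] = 1
session.persist()   # a single write-back, not one write per mutation
```

Beaker accomplishes the same thing internally; the controller-side rule is just: don't call save() more than once per request.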

4. Use cherryPy instead of paste HTTP server ==> 5-10%

It's useful to note that they work differently. If there are more requests than the CherryPy threadpool can handle, they just get dropped. The Paste HTTP server does a few things the CherryPy one doesn't, like checking for stuck threads and optionally spawning extra threads when the thread pool is full. Keeping those differences in mind helps when evaluating performance. The Paste HTTP server can also recycle threads, which helps if you have memory leaks in your code (annoyingly, the MySQLdb Python adapter has some leaks).

Also, I'd recommend trying the Spawning package: easy_install spawning, then edit your ini to use spawning#main and set the thread pool down to 0 to get the async behavior. You can also set num_processes (to however many CPU cores you have). Overall I've found Spawning to give superior performance to CherryPy's HTTP server: you'll get higher requests per second AND lower latency for the fastest/longest request. It's awesome.
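As a sketch, the [server:main] section would look something like this; the exact option names (threadpool_workers, num_processes) are approximate from memory, so verify them against the Spawning docs:

```ini
[server:main]
use = egg:Spawning#main
host = 0.0.0.0
port = 5000
# 0 disables the thread pool and uses the async dispatcher
threadpool_workers = 0
# roughly one worker process per CPU core
num_processes = 4
```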

5. Use nginx as a proxy instead of apache => 25% (and save 15MB per
worker)

Spawning gives slightly more req/sec with better latency than an nginx proxy in front of several Pylons w/CherryPy WSGI server processes.
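If you do go the nginx route, a minimal proxy setup balancing across several paster/CherryPy worker processes looks roughly like this (ports and upstream name are just placeholders):

```nginx
# Round-robin across several local Pylons worker processes.
upstream pylons_workers {
    server 127.0.0.1:5000;
    server 127.0.0.1:5001;
    server 127.0.0.1:5002;
}

server {
    listen 80;
    location / {
        proxy_pass http://pylons_workers;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```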

6. Remove redundant code from middleware (e.g. cache middleware)

I look forward to fixing that in the next release. :)

7. Use multiple processes instead of just increasing the number of
threads in a single paster app server => 15-20%

The fact that spawning can do this inside with an async dispatcher really helps in its amazing performance.

8. Cache pages in memcached, and have nginx bypass the app server by
fetching directly from memcached (and use SSI to render dynamic
fragments) ==> 150%
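A sketch of the nginx side of that setup, using the built-in memcached and SSI modules. The key scheme ("page:$uri") is just an example; the app has to write rendered pages into memcached under exactly the same keys, and on a miss nginx falls through to the app:

```nginx
location / {
    ssi on;                        # expand <!--#include ... --> fragments
    set $memcached_key "page:$uri";
    memcached_pass 127.0.0.1:11211;
    error_page 404 502 = @app;     # cache miss -> hit the Pylons app
}

location @app {
    proxy_pass http://127.0.0.1:5000;
}
```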

With all these small fixes and optimizations I end up with something
that's X-times faster, which is noticeable even for low volume apps
thanks to much better response times improving the user experience.

Not to mention that I keep a scalable upgrade path by making sure each
component can easily be moved to a different machine (e.g. => don't use
mod_wsgi or local caches).

Yup, if you find any more issues in Pylons or Pylons dependencies, please let me know, I'd love to increase efficiency.

Cheers,
Ben
