Re: Recommended production deployment

Tycon Tue, 13 Jan 2009 20:10:36 -0800

Actually I have apache+modwsgi running flawlessly, and everything I
said is based on meticulous performance benchmarking and theoretical
profiling of deployment architectures.


All benchmarks were run against the full Pylons stack in production mode,
with debugging and logging turned off. The test was a simple "hello
world" page with no template rendering, database access or other
external links or references. Apache is prefork MPM v2.2.8 with
modwsgi 1.3 using python 2.5.2.
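For reference, a "hello world" target of this kind looks roughly like the following at the WSGI level. This is only a minimal sketch using the standard library, not the actual Pylons controller used in the benchmarks. It is written for a current Python 3 rather than 2.5.2, and the names hello_app and call_app are made up for illustration.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Smallest useful WSGI app: fixed body, no templates, no database."""
    body = b"Hello World"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process, with no HTTP server involved."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the app reports instead of writing to a socket.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app(hello_app))
```

Since the same callable can be hosted by CherryPy, Paste or mod_wsgi, a test like this measures almost nothing except the server layer in front of it.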

The benchmarks CLEARLY show that using a stand-alone app server is
MUCH faster than using apache+modwsgi to serve a page (returned from
the aforementioned "hello world" controller action). When using
CherryPy as the HTTP server for pylons, the throughput in req/sec is
almost twice that of apache+modwsgi. PasteHTTP is 15% slower than
CherryPy but still much faster than apache+modwsgi.

But of course using a stand-alone app-server as the web-server has a
few drawbacks:

1. Static files are also served by pylons, which is slow.
2. Cannot use multiple processes, which is required to make optimal
use of system resources and allow for scalability across multiple CPUs
and multiple machines.

So in most cases you will want to have a reverse proxy front-end that
acts as a load balancer and also serves the static files. As you
mention, nginx is a good choice. In this configuration, I would assume
it would be better to have nginx forward the requests directly to a
pool of standalone pylons app-servers, rather than to a pool of
apache+modwsgi processes.

So, while using only standalone CherryPy to serve a pylons app has
some drawbacks, those drawbacks are avoided by putting a reverse proxy
in front that acts as a load balancer and serves the static files.
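The division of labour described above can be sketched in a few lines. This is not how nginx is configured or implemented; it only illustrates the routing decision, assuming round-robin balancing. The backend addresses, the /static/ prefix, and the function names are all hypothetical.

```python
import itertools

# Hypothetical pool: one standalone app server per port.
BACKENDS = ["127.0.0.1:5000", "127.0.0.1:5001", "127.0.0.1:5002"]


def make_picker(backends):
    """Return a function that hands out backends in round-robin order."""
    pool = itertools.cycle(backends)
    return lambda: next(pool)


def route(path, pick_backend, static_prefix="/static/"):
    """Decide who handles a request: the proxy itself or an app server."""
    if path.startswith(static_prefix):
        # Static files never reach the Python processes.
        return "proxy-static"
    return pick_backend()


if __name__ == "__main__":
    pick = make_picker(BACKENDS)
    for p in ["/", "/static/site.css", "/hello", "/about", "/contact"]:
        print(p, "->", route(p, pick))
```

Running several app-server processes behind the proxy is what addresses drawback 2, and the static-prefix branch is what addresses drawback 1.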

So when is it best to use apache+modwsgi? If we have a single node
with multiple cores and a lot of RAM, and we need to serve a lot of
static files, then apache+modwsgi would be better than using a
standalone app server. But even for this scenario, nginx as a reverse
proxy gives about the same performance.

On Jan 13, 5:53 pm, Graham Dumpleton <[email protected]>
wrote:
> On Jan 14, 10:42 am, "Mike Orr" <[email protected]> wrote:
>
>
>
> > On Tue, Jan 13, 2009 at 6:52 AM, Tycon <[email protected]> wrote:
> > > Last word on modwsgi and its "daemon" mode, which is similar to
> > > reverse proxy and fcgi in that it separates the web server and app
> > > server. As such, it has the same theoretical performance as reverse
> > > proxy and fcgi (which in fact provide the same performance), but it
> > > uses a proprietary communication protocol, and unlike proxy or fcgi,
> > > it requires the app and web server processes to be on the same machine
>
> > Is *that* what you're talking about when you say "daemon mode" and
> > "proprietary protocol"?  I thought you meant daemon mode as in running
> > PasteHTTPServer or CherryPy as a daemon, and proprietary protocol as
> > in WSGI or SCGI.
>
> > The main point of mod_wsgi's daemon mode is to isolate bugs/memory
> > leaks between the web application and the server, and to track the
> > application's individual resource usage in the 'ps' listing.  It's not
> > designed for multi-machine scalability.
>
> > As for its "proprietary" protocol, I consider that an internal matter
> > of mod_wsgi.  What matters is whether it works, and I haven't heard
> > any complaints in that regard.
>
> > Ultimately it comes down to the sysadmin's time of setting up mod_wsgi
> > now and possibly switching to something else later, vs setting up
> > something multi-machine scalable now (which is more work up front).
> > And that depends on how likely a traffic onslaught is, how quickly the
> > load will accelerate, and the sysadmin's future availability.
>
> You don't need to switch to something else when you want to go to a
> multi machine configuration. This is just part of the FUD that they
> were pushing on the irc channels in the past to try and discredit
> Apache/mod_wsgi. It just seems that Tycon is parroting this same
> message. I wouldn't be surprised if he has never even used Apache/
> mod_wsgi. Certainly the originator of a lot of this FUD on the irc
> channels in the past freely admitted he had installed neither
> mod_python nor mod_wsgi with Apache.
>
> When you want to start looking at horizontal scaling you do exactly
> what you would do were you scaling up Apache for any other scenario.
> That is, you stick perlbal, pound, nginx or some other proxying/load
> balancing solution in front and run an Apache instance on each of the
> machines in your cluster.
>
> Since running nginx in front of Apache/mod_wsgi to handle static files
> is a common scenario, that same nginx instance could be used for the
> job, given that it is still likely going to handle the load of both
> static files and proxying fine because of being event driven rather
> than threaded.
>
> Hmmm, this sounds exactly like what is being suggested if using
> paste server or cherrypy wsgi server. Strange that.
>
> When you read this person's other posts I think it is quite clear that
> he doesn't really understand the difference between the WSGI
> specification and mod_wsgi as an implementation of it. Specifically
> that WSGI is a programmatic API and not a wire protocol. It is also
> questionable how much he knows about how to set up Apache and mod_wsgi.
> We will probably never see anything about how he actually configured
> Apache and mod_wsgi for these benchmarks that he surely must have run
> to come to these conclusions. Like most benchmarks they are probably
> flawed due to not setting up his tests properly so that an equal
> comparison was being performed, or not even benchmarking something
> realistic.
>
> What I find particularly clueless about comparisons between different
> hosting mechanisms for dynamic web applications is that they quite
> often test a hello world application and not a real application. As
> such any figures are pretty meaningless given that any difference
> between different hosting mechanisms is likely in milliseconds. When
> for a large application the overall request time is 10-100
> milliseconds, the difference just disappears as noise within the real
> bottleneck which is the application and/or database access. Add on top
> of that that you would never want to run your web server at maximum
> capacity on an ongoing basis, you would generally have more than
> enough headroom even if different hosting solutions perform a bit
> differently, thus differences don't matter anyway.
>
> Overall you would be much better off focusing your time on improving
> the performance of your application and database access than having a
> pissing contest about raw request throughput for some unrealistic
> pattern of traffic that your site will never experience.
>
> Graham
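
The argument quoted above, that hosting overhead disappears inside the request time of a real application, can be made concrete with a little arithmetic. The numbers below are illustrative assumptions, not measurements from this thread: a per-request hosting overhead of 1.0 ms for the slower setup and 0.5 ms for the faster one (a 2x gap on a hello-world page), and 50 ms of application and database work for a realistic page.

```python
def relative_saving(app_ms, slow_overhead_ms, fast_overhead_ms):
    """Fraction of total request time saved by switching to the faster host."""
    slow_total = app_ms + slow_overhead_ms
    fast_total = app_ms + fast_overhead_ms
    return (slow_total - fast_total) / slow_total


if __name__ == "__main__":
    # Hello world: the application itself costs (almost) nothing.
    print("hello world: %.1f%%" % (100 * relative_saving(0.0, 1.0, 0.5)))
    # Realistic page: assumed 50 ms of application and database time.
    print("real app:    %.1f%%" % (100 * relative_saving(50.0, 1.0, 0.5)))
```

Under these assumptions the faster host halves the hello-world request time but trims only about 1% off the realistic request.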