On Wed, 2008-01-30 at 19:34 -0600, James Bennett wrote:
> On Jan 30, 2008 6:01 PM, Mark Green <[EMAIL PROTECTED]> wrote:
> > Ahem, there's a huge difference between being confronted with
> > a spinner/progress bar or an error page. The former speaks
> > "Please wait", the latter speaks "Try again".
> 
> OK, so let's break this down.

Yay, thanks for that exhaustive response. :-)
I guess we'll eventually have to agree to disagree, but
I'll add my counterpoints for completeness.

> There are two potential cases where you run up against your database's
> concurrent connection limit:
> 
> 1. Your average traffic level involves more concurrent connections
> than your database permits.
> 2. Your average traffic level is within the number of concurrent
> connections your database permits, but you are experiencing a
> temporary spike above that level.
> 
> In case (1), the odds of a timeout/retry scheme yielding success
> within the time the average user is prepared to wait are low; your
> database is simply swamped, and the only options are to increase
> available resources (in the form of database connections) or refuse to
> serve some number of requests.
> 
> So case (2) is the only one worth discussing as a target for a
> timeout/retry scheme.

Yes.

> In this case, you are (apparently) asking for Django to provide three things:
> 
> 1. A mechanism for specifying a set number of database connections
> which Django will persistently maintain across request/response
> boundaries.
> 2. A mechanism for specifying how long Django should wait, when
> attempting to obtain a connection, before timing out and concluding
> that no connection will be obtained.
> 3. A mechanism for specifying how many times, after failing to obtain
> a connection, Django should re-try the attempt.

Yes. Basically a bog-standard connection pool.
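
For illustration, that could look something like this in settings.py.
These names are purely hypothetical -- none of them exist in Django,
they're just the three directives spelled out:

    # Hypothetical pool directives -- invented names, not real Django settings.
    DATABASE_POOL_SIZE = 10      # connections held open across requests
    DATABASE_POOL_TIMEOUT = 5    # seconds to wait for a free connection
    DATABASE_POOL_RETRIES = 3    # attempts before giving up with an error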

> How, then, would we apply these hypothetical configuration directives
> to the situation of a traffic spike? There are two possibilities:
> 
> 1. Set these directives in advance and have them be a permanent part
> of the running application's configuration.
> 2. Avoid setting these directives until a spike is imminent (difficult
> to do) or in progress, and leave them only so long as the spike
> persists.
> 
> In case (1) you are flat-out wasting resources and complicating
> Django's operation, by holding resources which are not used and
> mandating more complex logic for obtaining those resources. In nearly
> all cases this is a bad trade-off to make.

What resources are held and wasted, exactly?
Maintaining a number of open TCP connections is much cheaper
than creating and discarding them at a high rate.

I agree that Django's CGI-style mode of operation might make the
implementation tricky (a separate thread?), but you can't seriously
suggest that creating and discarding n connections per second would
be cheaper than maintaining, say, n*10 long-lived connections?
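
To put a number on that, here's a rough micro-benchmark sketch
(assuming psycopg2 and a reachable Postgres; the DSN is a placeholder):

    import time
    import psycopg2

    DSN = "dbname=test user=test"  # placeholder, adjust for your setup

    def connect_per_query(n):
        # full TCP + auth handshake for every single query
        start = time.time()
        for _ in range(n):
            conn = psycopg2.connect(DSN)
            cur = conn.cursor()
            cur.execute("SELECT 1")
            cur.fetchone()
            conn.close()
        return time.time() - start

    def reuse_connection(n):
        # one handshake, amortised over n queries
        start = time.time()
        conn = psycopg2.connect(DSN)
        cur = conn.cursor()
        for _ in range(n):
            cur.execute("SELECT 1")
            cur.fetchone()
        conn.close()
        return time.time() - start

    print("connect per query: %.2fs" % connect_per_query(200))
    print("reused connection: %.2fs" % reuse_connection(200))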

Predictability is the keyword here. From the perspective of my database
(or pgpool instance), I want to be sure that the configured maximum
number of inbound connections can never be exceeded, because clients
(such as Django) should never get into the uncomfortable situation
of having to deal with a "connection refused".

"Fail-fast" as by django just doesn't work so well on the frontend.
Users don't like error pages at all, they're a surefire way to damage
your reputation. Yes, slow load times are bad too, but still a much more
comfortable position to be in during temporary rush hours.

(ever had some marketing droid yell at you at his highest pitch because
 10% of their expensive click-throughs went to HTTP-500 hell? ;-) )

> In case (2) the first option really isn't possible, because you can't
> reliably predict traffic spikes in advance. This leaves the second
> option, which requires you to be constantly watching the number of
> database connections in use and involves shutting down your
> application temporarily in order to insert the necessary configuration
> directives. It is also unlikely that you will be able to do so before
> at least some users have received error pages.

Hmm. Dynamic adaptation of the pool config is an interesting subject
(Java's c3p0 can actually do it at runtime, within limits, I think) but
it's totally out of my scope here. I think a fixed pool config would
suffice to achieve my goal of "graceful behaviour under load".

> So you must either waste resources, or accept increased monitoring
> overhead and the inevitability that some requests will not receive
> successful responses.
> 
> Add to this the following disadvantages:
> 
> * More complex configuration of Django (and hence more potential for
> configuration error).

Oh c'mon. A connection pool is not so complicated.

> * More complex code base for Django (and hence more operating overhead
> and more potential bugs).

Well, I'd think that the constant flux of connections
causes more overhead than even the sloppiest implementation
of a pool ever could. Creating a TCP connection is not free,
and then there's the DB handshake on top.

Out of curiosity: does Django create a new connection for each query or
for each HTTP request?

And about code complexity... yes, tying the thing to the Django
execution model might take a bit of thinking. But the pool itself
should be a 20-liner; we're talking Python, right? ;-)
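
Something like this, give or take -- a sketch only, psycopg2 and all the
names are my assumptions, and it ignores dead-connection handling:

    import queue
    import psycopg2

    class ConnectionPool(object):
        def __init__(self, dsn, size, timeout):
            self._timeout = timeout            # seconds to wait for a free conn
            self._pool = queue.Queue(maxsize=size)
            for _ in range(size):              # open all connections up front
                self._pool.put(psycopg2.connect(dsn))

        def acquire(self):
            # blocks until a connection frees up; raises queue.Empty on
            # timeout -- the "wait, then fail" behaviour discussed above
            return self._pool.get(timeout=self._timeout)

        def release(self, conn):
            self._pool.put(conn)

    # usage: pool = ConnectionPool("dbname=test", size=10, timeout=5)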

> * Loss of flexibility in that the application layer must now possess
> more information about the database layer.

Oh c'mon again...

> * Loss of flexibility in that Django must now maintain persistent
> resources between requests, and have some mechanism for equitably
> distributing them across arbitrarily-many server processes on one or
> more instances of the HTTP daemon on one or more virtual or physical
> machines.

Sounds like a win to me, what kind of flexibility would be lost here?

> Given all of this, I don't see this level of configuration in the
> application layer as a worthwhile trade-off to make.
> 
> You've mentioned JDBC, which points to one potential conflict between
> your expectations and the reality of Django's design: Java-based web
> applications typically run threaded within a single Java-based server
> instance which is able to persist resources across requests
> and manage them equitably across all threads.
> 
> Django, on the other hand, typically runs multi-*process* behind or
> embedded in an HTTP daemon and shares no information across the
> boundaries of those processes, and deliberately seeks to minimize the
> number of things which must be persisted across the boundaries of
> request/response cycles. This is a fundamentally different
> architecture, and so decreases the portability of assumptions from a
> Java-style background. In particular, there is no master Django
> process which can create and maintain a resource pool and allocate
> resources from it to other Django processes (especially since
> processes may be running behind/embedded in other instances of the
> HTTP daemon on the same machine, or on remote machines). Which brings
> up yet another disadvantage:

Well, there'd still be the option of adding a simple "connection retry"
without pooling, although that wouldn't do away with the high churn
of connections.

> * The configuration of the number of connections Django should
> maintain, and the timeout/retry directives, must be calculated with
> the assumption that they will be applied by each and every process
> independently and without knowledge of any of the others.
> 
> At that point even the calculations involved in determining the
> correct settings are probably impossible, since common server
> arrangements can and will increase or decrease the number of running
> processes on the fly; calculations for Apache/mod_python, for example,
> must be able to equitably and reliably manage your resources for any
> number of independent processes "n" such that StartServers <= n <=
> MaxClients, and must take into account the turnover of processes which
> reach MaxRequestsPerChild. Throw in load-balancing among multiple HTTP
> daemons, and the complexity becomes mind-boggling.
> 
> Based on this, it's even more clear that attempting to do this at the
> application layer is a bad idea.
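
To make the quoted arithmetic concrete (all figures below are made up,
just to show how the per-process budget collapses):

    max_db_connections = 100            # e.g. PostgreSQL max_connections
    max_clients = 256                   # Apache MaxClients, example value
    print(max_db_connections // max_clients)
    # 0 -- there is no safe static per-process pool size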

Well, doing it *this* way is obviously a bad idea. If Django were to
persist anything across requests (such as a connection pool), then
of course a long-running thread/process seems like the logical choice.

Although I still wonder whether the mod_python stuff really doesn't allow
for any kind of persistence, or whether Django just chose not to use it.

> Since you apparently place an extremely high value on availability of
> a timeout/retry feature, I recommend seeking out an external
> connection manager which offers that feature; you will gain all the
> benefits of that feature while retaining the simplicity and
> flexibility of Django's configuration and shared-nothing architecture.

As I said, the particular problem of "users getting error pages instead
of delays" unfortunately can't be solved externally.

I guess the most realistic compromise would be to add a
"connection retry" without pooling. That way Django could keep its
shared-nothing execution model but still become more robust (as in:
not so prone to showing error pages) under load.
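
A sketch of what I mean -- psycopg2 stands in for whatever the backend
actually uses, and the names are illustrative:

    import time
    import psycopg2

    def connect_with_retry(dsn, retries=3, delay=0.5):
        for attempt in range(retries):
            try:
                return psycopg2.connect(dsn)
            except psycopg2.OperationalError:
                if attempt == retries - 1:
                    raise               # out of attempts: fail as today
                time.sleep(delay)       # back off briefly, then try again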


-mark


