2010/1/15 Damian <[email protected]>:
> Hi,
>
> Every few days, when we experience higher loads we get sqlalchemy's
>
> TimeoutError: QueuePool limit of size 5 overflow 10 reached,
> connection timed out, timeout 30

I presume this is a client-side error from SQLAlchemy's connection
pool, not an error raised by the PostgreSQL server itself.

> Along with that I see an increase in (2-3 a minute):
>
> (104)Connection reset by peer: core_output_filter: writing data to the
> network
>
> and
>
>  (32)Broken pipe: core_output_filter: writing data to the network
>
> in my apache error logs.

These errors would normally just indicate that the HTTP client severed
the connection before the request completed. So, with the machine
bogging down, users give up and possibly hit reload.

> Having checked over my pylons code a few times, the Session.remove()
> should always be called.

Presumably you are using a recent psycopg2. The remove() method wasn't
properly calling close(); that was fixed back in 2007.

  http://www.mail-archive.com/[email protected]/msg05485.html

> I'm worried that the broken pipe or
> connection reset by peer mean that remove isn't being called.

Even when the client connection fails, mod_wsgi will, as the WSGI
specification requires, call close() on the iterable returned by the
WSGI application. So, as long as the WSGI application cleans up
correctly when close() is called in that way, even if not all of the
response could be returned, you should be okay.
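As a sketch (the middleware and names here are hypothetical, not taken
from your code), per-request cleanup such as Session.remove() can be
tied to that close() call by wrapping the application's iterable:

```python
# Sketch of WSGI middleware that guarantees per-request cleanup even
# when the client drops the connection mid-response. "cleanup" stands
# in for something like a SQLAlchemy scoped session's Session.remove.

class CleanupMiddleware:
    def __init__(self, app, cleanup):
        self.app = app
        self.cleanup = cleanup  # e.g. Session.remove

    def __call__(self, environ, start_response):
        result = self.app(environ, start_response)
        return _ClosingIterable(result, self.cleanup)

class _ClosingIterable:
    def __init__(self, iterable, cleanup):
        self.iterable = iterable
        self.cleanup = cleanup

    def __iter__(self):
        return iter(self.iterable)

    def close(self):
        # mod_wsgi calls this even if the client severed the
        # connection before the full response was written.
        try:
            if hasattr(self.iterable, 'close'):
                self.iterable.close()
        finally:
            self.cleanup()
```

Because the cleanup sits in close() rather than after the response
loop, it runs whether or not the whole response reached the client.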

> The server is running mod_wsgi with apaches mpm_worker with the
> following config:
>
> <IfModule mpm_worker_module>
>    StartServers         16
>    MaxClients          480
>    MinSpareThreads      50
>    MaxSpareThreads     300
>    ThreadsPerChild      30
>    MaxRequestsPerChild   0
> </IfModule>
>
> and using mod_wsgi's daemon mode:

Are you serving static media or running non-Python web applications on
the same Apache instance?

If not, then the worker MPM configuration is creating many more
processes/threads than are needed just to proxy requests to the
available processes/threads on the daemon process side.

>  WSGIDaemonProcess somename user=www-data group=www-data processes=4
> threads=32

Presumably you also have WSGIProcessGroup set and are in fact running
in daemon mode.
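For reference, in daemon mode those two directives are normally paired
like this (a sketch using the values from your config; "somename" is
whatever your process group is actually called):

```
WSGIDaemonProcess somename user=www-data group=www-data processes=4 threads=32
WSGIProcessGroup somename
```

Without the matching WSGIProcessGroup, the application would silently
run in embedded mode inside the Apache child processes instead.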

> Is this somehow overkill?  The server is a well speced quad core with
> 8 gigs of ram and fast hard drives.

Use of 32 threads is possibly overkill. The default is 15 per process,
and even that could well be overkill.

If your request times are quick, you can usually get away with under 5
threads. The only reason to run more is if you need a buffer due to
having some number of long-running requests.

> It also runs the database server (postgres).
>
> Has anyone else experienced this kind of problem?  I've cross posted
> this to both the mod_wsgi and sqlalchemy mailing lists - hope that's
> ok as I believe this may be relevant to both groups.

The way I read this is that you have 32 potential threads in a process
which want to access the database, but SQLAlchemy is set up to allow
at most 15 connections in its pool (size 5 plus overflow 10). Thus, on
a per-process basis, if things bog down and the database is overloaded
with requests arriving quicker than they can be handled, the pool can
run out of connections, and requests queued up waiting for one can
exceed the 30-second wait time.
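To make the arithmetic concrete, here is a small stdlib-only
simulation (this is not SQLAlchemy's actual QueuePool, just an
illustration) of a pool capped at 15 connections, where a checkout
beyond capacity times out just as the TimeoutError above describes:

```python
import queue

# Stand-in for a connection pool: 15 "connections" in total
# (pool size 5 plus overflow 10), checked out with a timeout.
POOL_CAPACITY = 15

pool = queue.Queue()
for i in range(POOL_CAPACITY):
    pool.put("conn-%d" % i)

def checkout(timeout=0.1):
    # Raises queue.Empty -- analogous to SQLAlchemy's TimeoutError --
    # when every connection is already checked out.
    return pool.get(timeout=timeout)

# 15 busy request threads each holding a connection exhaust the pool;
# the 16th concurrent checkout has nothing left to grab and times out.
held = [checkout() for _ in range(POOL_CAPACITY)]
try:
    checkout()
except queue.Empty:
    print("pool exhausted: checkout timed out")
```

With 32 worker threads per process but only 15 pooled connections,
any sustained slowdown leaves up to 17 threads stuck in that timed-out
state.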

Dropping down to 5 threads per mod_wsgi daemon process would avoid
this, as the thread count would then be less than the number of
connections in the pool and couldn't exceed it. When the system does
bog down, with fewer threads across all the daemon processes, if they
are all used up, it just means that requests effectively get queued up
within the Apache server child processes. Presuming the backend
recovers, those queued requests will then in turn be handled.

If the number of requests arriving is sufficient that all the threads
across the Apache server child processes also become busy, and the
socket listener queue for Apache's main HTTP port is exceeded, only
then would clients start to see connection refused.

Dropping down the number of daemon threads in this way can therefore
actually be used as a way of throttling connections where it is known
that your database isn't going to be able to handle more than a
certain number of requests at the same time. In other words, rather
than letting a large number of requests through and simply overloading
the database even more, making things worse, the limit, with
subsequent queueing of requests within Apache, allows one to trickle
connections through while in an overloaded state.

Anyway, since I don't know much about SQLAlchemy and psycopg2, that is
my guess at what is happening.

Graham
-- 
You received this message because you are subscribed to the Google Groups 
"modwsgi" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/modwsgi?hl=en.


Reply via email to