On 20 October 2010 10:39, Patrick Michael Kane <[email protected]> wrote:
> Hey Graham:
>
> Replies inline:
>
> On Tue, Oct 19, 2010 at 4:21 PM, Graham Dumpleton <
> [email protected]> wrote:
>
>> On 20 October 2010 00:54, Patrick Michael Kane <[email protected]>
>> wrote:
>> > Hey all:
>> >
>> > We're running a Django site using Apache+mod_wsgi.  The stack is
>> > pretty typical: Apache 2.2.14, mod_wsgi 3.3, Python 2.6.1.  We're
>> > running single-threaded using DaemonProcess (WSGIDaemonProcess staging
>> > threads=1 processes=8 maximum-requests=10 python-path=[path] display-
>> > name=%{GROUP}).
>> >
>> > We're seeing an issue where the Apache server will stop responding to
>> > all requests (even non-WSGI ones) after a fairly large number of
>> > requests.  When we strace the wsgi processes, they are all in poll(),
>> > waiting to be talked to:
>> >
>> > poll([{fd=3, events=POLLIN}], 1, -1...
>> >
>> > When we look at the httpds, they are blocking on connect to the
>> > mod_wsgi unix socket:
>> >
>> > connect(69, {sa_family=AF_FILE, path="/home/actionkit/releases/stable/
>> > apache/logs/.1873.28.7.sock"}, 110
>> >
>> > A restart fixes it and then the problem goes away for a few days to a
>> > week.  We're seeing the same behavior across 3 identical webservers.
>> >
>> > The problem happens once every few days.
>> >
>> > If folks have any ideas on the cause, I'm all ears.  Alternatively, if
>> > there's additional steps we can take to debug that would be helpful,
>> > let me know.
>>
>> Can you provide the output of running -V option on Apache. Eg:
>>
>>  /usr/sbin/httpd -V
>>
>> Want to verify what MPM you are using and some of the compilation options.
>>
>
> Server version: Apache/2.2.3
> Server built:   Jan 21 2009 22:00:55
> Server's Module Magic Number: 20051115:3
> Server loaded:  APR 1.2.7, APR-Util 1.2.7
> Compiled using: APR 1.2.7, APR-Util 1.2.7
> Architecture:   64-bit
> Server MPM:     Prefork
>  threaded:     no
>    forked:     yes (variable process count)
> Server compiled with....
>  -D APACHE_MPM_DIR="server/mpm/prefork"
>  -D APR_HAS_SENDFILE
>  -D APR_HAS_MMAP
>  -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
>  -D APR_USE_SYSVSEM_SERIALIZE
>  -D APR_USE_PTHREAD_SERIALIZE

The above two definitions mean that when the 'default' accept mutex
type is selected, 'sysvsem' is what actually gets used. This is the one
I am worried about.

In your Apache configuration set:

  WSGIAcceptMutex flock

and see how things go.

I'll be updating mod_wsgi code in trunk to deal with problem I saw and
add extra debugging to try and highlight when problem occurs and work
out why it might be happening.

>  -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
>  -D APR_HAS_OTHER_CHILD
>  -D AP_HAVE_RELIABLE_PIPED_LOGS
>  -D DYNAMIC_MODULE_LIMIT=128
>  -D HTTPD_ROOT="/etc/httpd"
>  -D SUEXEC_BIN="/usr/sbin/suexec"
>  -D DEFAULT_PIDLOG="logs/httpd.pid"
>  -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
>  -D DEFAULT_LOCKFILE="logs/accept.lock"
>  -D DEFAULT_ERRORLOG="logs/error_log"
>  -D AP_TYPES_CONFIG_FILE="conf/mime.types"
>  -D SERVER_CONFIG_FILE="conf/httpd.conf"
>
>> As to strace on WSGI processes, really need a gdb stack trace on all
>> threads if you can manage to get one.
>>
>> I need to see if the actual request handler threads have exited. My
>> latest theory is that they might have exited. The poll() may be the
>> main thread which is just waiting for shutdown indicator to be sent
>> via a socketpair from signal handler.
>
>
> I can get a backtrace, but we're not compiled with -g in our production
> environment.  Is a non-debugging backtrace going to be useful?

So long as not -O compiled, should still show functions.

As documented in:

  http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Debugging_Crashes_With_GDB

use:

  thread apply all bt

> It would be possible for us to compile mod_wsgi with -g, if that were
> helpful, but redeploying our Apache with debugging would be a tall order.
>
> Since we're running single-threaded, is a single process ok?

If you mean a single daemon process, probably not, as that means
requests are handled sequentially with no concurrency, which depending
on the application and amount of traffic may be bad.

>> Can you also add to your Apache configuration files at global scope:
>>
>>  WSGIAcceptMutex xxx
>>
>>
> Accept mutex lock mechanism 'xxx' is invalid. Valid accept mutex mechanisms
> for this platform are: default, flock, fcntl, sysvsem, pthread.
>
>>
>> Do the same thing for the directive:
>>
>>  AcceptMutex xxx
>>
>>
> xxx is an invalid mutex mechanism; Valid accept mutexes for this platform
> and MPM are: default, flock, fcntl, sysvsem, pthread.
>
>
>> BTW, any reason you are running maximum-requests so low? One of the
>> observations in the past is that the problem is made worse when WSGI
>> processes are being restarted a lot as would be case for small number
>> of maximum requests.
>
> Memory leaks.

Hmmm, pretty severe leaks if has to be that low.

Do you know what is causing it and whether affects all URLs or only
code used by some URLs?

What one could do is create multiple daemon process groups and
delegate just the subset of URLs which exhibit memory creep/leaks to a
daemon process group of their own and set maximum requests low on
that, but for everything else in the original daemon process group,
eliminate maximum requests and allow the processes to stay resident all
the time. That would give better overall performance, as you are not
restarting processes and having to reload the application all the time.
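
As a rough sketch only, that split might look something like the
following. The group name 'staging-leaky', its process count and the
'/leaky/' URL prefix are placeholders I have made up for illustration;
substitute whatever URLs actually show the memory growth. The other
options are carried over from your existing WSGIDaemonProcess line.

```apache
# Original group: no maximum-requests, so processes stay resident.
WSGIDaemonProcess staging threads=1 processes=8 python-path=[path] display-name=%{GROUP}

# Hypothetical second group, only for the URLs that leak; recycled often.
WSGIDaemonProcess staging-leaky threads=1 processes=2 maximum-requests=10 python-path=[path] display-name=%{GROUP}

# By default, everything is handled by the original group.
WSGIProcessGroup staging

# '/leaky/' is a placeholder for whichever URL prefix exhibits the leak.
<Location /leaky/>
WSGIProcessGroup staging-leaky
</Location>
```

How many processes to give each group depends on how much of your
traffic hits the leaky URLs, so treat the numbers above as a starting
point rather than a recommendation.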

Graham

-- 
You received this message because you are subscribed to the Google Groups 
"modwsgi" group.