My guess is that this is caused by the way the current threadpool code
works. I'll have to see if I can find my test data, but here is what I
remember off the top of my head.

There are a few causes:

1. Use of Ns_CondSignal instead of Ns_CondBroadcast to wake up
threadpool threads. Usually the thread which just sent the signal is
the one that "wakes up", grabs the mutex again, and services another
request. As a result, that one thread reaches the max number of
requests per thread and tries to exit.
2. The exiting thread starts up a replacement thread under certain
conditions. Sometimes, with many requests coming in, this new thread
will grab the mutex repeatedly and get into the same condition as #1.
However, the thread exiting in #1 hasn't finished exiting yet, so now
both it and the replacement it started are trying to exit.
3. The basic problem is detecting when to allow threads to exit. For
instance, a thread might exit because it has been sitting idle for
too long. Say it has serviced 10 connections and is supposed to exit
at 50. By exiting early, it takes away the ability to handle 40
requests. The visible result is that the server cannot keep the
number of threads between the min and max specified for a particular
threadpool. (Replacing a thread at thread exit patches over this
problem, but causes a different problem.)
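
To make causes 1 and 3 concrete, here is a toy model in Python. It is
not AOLserver code: the function names, the "sticky" and "round_robin"
policy labels, and the 4-thread pool are made up for illustration. It
assumes that in the "sticky" case the thread which just signalled wins
the mutex every single time, which is the worst case of what I describe
in #1, and compares that with requests being handed out evenly.

```python
def requests_until_first_exit(num_threads, max_conns_per_thread, policy):
    """Count requests served before any thread reaches its per-thread limit.

    policy "sticky": the same thread wins the mutex every time (assumed
    worst case of the Ns_CondSignal behavior described in #1).
    policy "round_robin": requests rotate evenly across threads.
    """
    if policy not in ("sticky", "round_robin"):
        raise ValueError("unknown policy: %r" % (policy,))
    served = [0] * num_threads  # connections handled by each thread
    current = 0                 # index of the thread that gets the next request
    total = 0
    while True:
        served[current] += 1
        total += 1
        if served[current] >= max_conns_per_thread:
            return total        # this thread would now try to exit
        if policy == "round_robin":
            current = (current + 1) % num_threads
        # "sticky": current is left unchanged, the same thread serves again


def lost_capacity(max_conns_per_thread, served_so_far):
    """Requests a thread could still have handled if it exits early (#3)."""
    return max_conns_per_thread - served_so_far


if __name__ == "__main__":
    print(requests_until_first_exit(4, 50, "sticky"))       # 50
    print(requests_until_first_exit(4, 50, "round_robin"))  # 197
    print(lost_capacity(50, 10))                            # 40
```

With 4 threads and a limit of 50, the sticky case forces the first
thread exit after only 50 requests while the other three threads sit
idle; handing requests out evenly gets to 197 before any thread exits.
The last line is just the arithmetic from #3.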

Note that the bug is hard to demonstrate; I only found it by
hammering the server. Under that kind of load, though, it inevitably
shows up and crashes the server. I also added extra logging code so
I could track which thread was servicing requests and the state of
the thread during each request (how many conns had been serviced, etc.).

tom jackson



2010/11/18 Björn Þór Jónsson <[email protected]>:
> Hi,
> After recently upgrading from AOLserver 4.5.0 to 4.5.1 and from
> nspostgres-4.0 to nspostgres-4.1 the server is repeatedly crashing (when it
> gets hammered by the google bots).  The error.log has many entries like
> these before the server dies:
>
> [17/Nov/2010:02:18:42][700.3218660208][-default:6195-] Notice: exiting:
> exceeded max connections per thread
> [17/Nov/2010:02:18:43][700.3217636208][-default:6193-] Notice: exiting:
> exceeded max connections per thread
> [17/Nov/2010:02:18:44][700.3219172208][-default:6196-] Notice: exiting:
> exceeded max connections per thread
> [17/Nov/2010:02:18:45][700.3218148208][-default:6194-] Error: Tcl exception:
> adp flush failed: connection closed
>     abort exception raised
>     while processing connection #31907:
>         GET ...
>         Host: localhost:8006
> ...
> nsthreads: pthread_create failed in NsCreateThread: Resource temporarily
> unavailable   [this is the last line in the log before the crash]
>
> This is the database section of the AOLserver config file:
> ns_section "ns/db/drivers"
> ns_param postgres nspostgres.so
> ns_section ns/db/pools
>     ns_param   pool1              "Pool 1"
>     ns_param   pool2              "Pool 2"
>     ns_param   pool3              "Pool 3"
> ns_section ns/db/pool/pool1
>     ns_param   maxidle            1000000000
>     ns_param   maxopen            1000000000
>     ns_param   connections        5
>     ns_param   extendedtableinfo  true
>     ns_param   driver             postgres
>     ns_param   datasource         localhost::${db_name}
>     ns_param   user               $user_account
> ns_section ns/db/pool/pool2
>     ns_param   maxidle            1000000000
>     ns_param   maxopen            1000000000
>     ns_param   connections        5
>     ns_param   extendedtableinfo  true
>     ns_param   driver             postgres
>     ns_param   datasource         localhost::${db_name}
>     ns_param   user               $user_account
> ns_section ns/db/pool/pool3
>     ns_param   maxidle            1000000000
>     ns_param   maxopen            1000000000
>     ns_param   connections        5
>     ns_param   extendedtableinfo  true
>     ns_param   driver             postgres
>     ns_param   datasource         localhost::${db_name}
>     ns_param   user               $user_account
> ns_section ns/server/${server}/db
>     ns_param   pools              "*"
>     ns_param   defaultpool        pool1
>
> The server is running on Ubuntu 10.04.1 LTS
> 2.6.32-25-generic-pae #45-Ubuntu SMP Sat Oct 16 21:01:33 UTC 2010 i686
> GNU/Linux
>
> Is there anything I should configure differently, or does anyone have
> other ideas about what might be causing this?
>
> /Björn
>
> --
> Björn Þór Jónsson
> http://bthj.is
>
> --
> AOLserver - http://www.aolserver.com/
>
> To Remove yourself from this list, simply send an email to
> <[email protected]> with the
> body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject:
> field of your email blank.
>

