My guess is that this is caused by the way the current threadpool code works. I'll have to see if I can find my test data, but here is what I remember off the top of my head.
There are a few causes:

1. Use of Ns_CondSignal instead of Ns_CondBroadcast to wake up threadpool threads. This usually results in the thread that just sent the signal "waking up" itself, grabbing the mutex again, and servicing another request. As a result, one particular thread reaches the max number of requests per thread and tries to exit.

2. The exiting thread starts up a replacement thread under certain conditions. Sometimes, with many requests coming in, this new thread will grab the mutex repeatedly and get into the same condition as #1. However, the thread exiting from #1 hasn't yet exited, and now the replacement it started is also trying to exit.

3. The basic problem is detecting when to allow threads to exit. For instance, a thread might exit because it has been sitting idle for too long. Say it has serviced 10 connections and is supposed to exit at 50. By exiting early, it removes the ability to handle 40 requests. The visible result is that the server cannot keep the number of threads between the min and max specified for a particular threadpool. (Replacing a thread at thread exit patches over this problem, but causes a different problem.)

Note that it is hard to demonstrate the bug; I only found it by hammering the server. But the bug inevitably shows up and crashes the server. I also added logging code so I could track which thread was servicing requests and the configuration of the thread during each request (how many conns had been serviced, etc.).

tom jackson

2010/11/18 Björn Þór Jónsson <[email protected]>:
> Hi,
> After recently upgrading from AOLserver 4.5.0 to 4.5.1 and from
> nspostgres-4.0 to nspostgres-4.1 the server is repeatedly crashing (when it
> gets hammered by the google bots).
> The error.log has many entries like
> these before the server dies:
>
> [17/Nov/2010:02:18:42][700.3218660208][-default:6195-] Notice: exiting:
> exceeded max connections per thread
> [17/Nov/2010:02:18:43][700.3217636208][-default:6193-] Notice: exiting:
> exceeded max connections per thread
> [17/Nov/2010:02:18:44][700.3219172208][-default:6196-] Notice: exiting:
> exceeded max connections per thread
> [17/Nov/2010:02:18:45][700.3218148208][-default:6194-] Error: Tcl exception:
> adp flush failed: connection closed
> abort exception raised
> while processing connection #31907:
> GET ...
> Host: localhost:8006
> ...
> nsthreads: pthread_create failed in NsCreateThread: Resource temporarily
> unavailable [this is the last line in the log before the crash]
>
> This is the database section of the AOLserver config file:
> ns_section "ns/db/drivers"
> ns_param postgres nspostgres.so
> ns_section ns/db/pools
> ns_param pool1 "Pool 1"
> ns_param pool2 "Pool 2"
> ns_param pool3 "Pool 3"
> ns_section ns/db/pool/pool1
> ns_param maxidle 1000000000
> ns_param maxopen 1000000000
> ns_param connections 5
> ns_param extendedtableinfo true
> ns_param driver postgres
> ns_param datasource localhost::${db_name}
> ns_param user $user_account
> ns_section ns/db/pool/pool2
> ns_param maxidle 1000000000
> ns_param maxopen 1000000000
> ns_param connections 5
> ns_param extendedtableinfo true
> ns_param driver postgres
> ns_param datasource localhost::${db_name}
> ns_param user $user_account
> ns_section ns/db/pool/pool3
> ns_param maxidle 1000000000
> ns_param maxopen 1000000000
> ns_param connections 5
> ns_param extendedtableinfo true
> ns_param driver postgres
> ns_param datasource localhost::${db_name}
> ns_param user $user_account
> ns_section ns/server/${server}/db
> ns_param pools "*"
> ns_param defaultpool pool1
>
> The server is running on Ubuntu 10.04.1 LTS
> 2.6.32-25-generic-pae #45-Ubuntu SMP Sat Oct 16 21:01:33 UTC 2010 i686
> GNU/Linux
>
> Is there anything I should
> configure differently or has any other ideas what
> might be causing this?
>
> /Björn
>
> --
> Björn Þór Jónsson
> http://bthj.is
>
> --
> AOLserver - http://www.aolserver.com/
>
> To Remove yourself from this list, simply send an email to
> <[email protected]> with the
> body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject:
> field of your email blank.
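
The reasoning in causes 1 and 3 of the reply can be illustrated with a toy model. This is only a sketch under stated assumptions, not AOLserver code: `first_exit`, `lost_capacity`, and the `hog` flag are hypothetical names, there is no real threading, and "hogging" simply means the thread that just finished a request also takes the next one, which is the behaviour the reply attributes to waking threads with Ns_CondSignal. The numbers (5 threads, exit at 50 connections, 10 already served) are examples; only the 10 and 50 come from the message.

```python
def first_exit(nthreads, maxconns, hog):
    """Return the request number at which the first worker hits maxconns.

    hog=True  : the same worker keeps taking every request (assumed model of
                the signalling thread re-grabbing the mutex).
    hog=False : requests rotate evenly over all workers (assumed model of a
                fair wake-up).
    """
    served = [0] * nthreads   # connections served so far, per worker
    current = 0               # index of the worker taking the next request
    request = 0
    while True:
        request += 1
        served[current] += 1
        if served[current] >= maxconns:
            return request    # this worker would now try to exit
        if not hog:
            current = (current + 1) % nthreads


def lost_capacity(served, maxconns):
    """Requests a worker could still have handled if it exits early (cause 3)."""
    return maxconns - served


print(first_exit(5, 50, hog=True))    # hogging worker reaches its limit early
print(first_exit(5, 50, hog=False))   # fair rotation delays the first exit
print(lost_capacity(10, 50))          # idle exit after 10 of 50 connections
```

In this model the hogging worker reaches its limit after 50 requests, while fair rotation delays the first exit until request 246. The model says nothing about the crash itself; it only shows why one thread can hit "exceeded max connections per thread" much sooner than the pool size would suggest.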
