Tom Jackson wrote:
On Friday 19 October 2007 09:12, Gustaf Neumann wrote:
Actually, if minthreads is set to a value > 0 (it was not set), then idle
threads should care about queued requests. Maybe the
easier fix is to set minthreads per default to 1. I will
try this, when i am back home (in a couple of hours).
AOLserver does not respect minthreads except at startup. This is part of the
same issue: Nothing except a request can create a thread. It seems like on
thread exit, a little accounting process could go on to bring threads.current
up to threads.min. This might require more than one thread creation, maybe
not.
fully agreed.
I continued a little on the issue and committed a patch
to CVS which always respects maxconns. Instead of serving,
in boundary situations and in the worst case, more than the
configured maxconns requests, the code now creates
a new connection thread automatically after a thread exits
at the end of its work cycle, when jobs are pending
and no other thread is able to process them.
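
A minimal sketch of the revival idea, as a toy model and not the
committed patch. PoolModel, strand_count() and the "burst" parameter
are hypothetical names; the model assumes that all requests arrive at
once, that only an arriving request can create a thread, and that each
thread serves at most maxconns requests before it exits:

```c
#include <assert.h>

/* Toy model of one connection pool; all names here are hypothetical. */
typedef struct {
    int maxthreads;   /* upper bound on concurrently living threads */
    int maxconns;     /* requests a thread serves before it exits */
    int current;      /* threads currently alive */
    int queued;       /* requests waiting in the queue */
} PoolModel;

/*
 * Run the pool until no thread is alive and return the number of
 * requests left in the queue. "burst" requests arrive at once and no
 * further request arrives afterwards. If revive is non-zero, an exiting
 * thread starts a replacement when work is pending and no other thread
 * is alive to take it.
 */
int strand_count(int maxthreads, int maxconns, int burst, int revive)
{
    PoolModel p = { maxthreads, maxconns, 0, 0 };
    int i;

    assert(maxthreads > 0 && maxconns > 0 && burst >= 0);

    /* Arrival phase: only an incoming request can create a thread. */
    for (i = 0; i < burst; i++) {
        p.queued++;
        if (p.current < p.maxthreads) {
            p.current++;
        }
    }

    /* Each live thread serves up to maxconns requests, then exits. */
    while (p.current > 0) {
        int served = p.queued < p.maxconns ? p.queued : p.maxconns;

        p.queued -= served;
        p.current--;                      /* thread exit */
        if (revive && p.queued > 0 && p.current == 0) {
            p.current++;                  /* thread revival on exit */
        }
    }
    return p.queued;
}
```

In this model, strand_count(5, 3, 20, 0) leaves 5 requests in the
queue with nobody to serve them, while strand_count(5, 3, 20, 1)
drains the queue. The real server is concurrent, so the numbers are
only illustrative.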
I also made a small variant of this (not in CVS), which
defines a new function NsCheckPools() in pools.c
that iterates over the pools and starts a thread in the
same situation as above.
NsCheckPools() could be extended to check for the existence of
minthreads etc. However, for this patch, I wanted to be
as little invasive as necessary, and added the thread-revival
code at the place where a thread exits.
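
The pool-checking variant could look roughly like the sketch below.
This is not the actual NsCheckPools() code: PoolSketch, check_pools()
and the honorMin flag are made-up names, the struct is a stand-in for
the real pool structure, and "no thread able to process" is simplified
to "no thread alive". Starting a thread is modelled by incrementing a
counter:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection pool; not the AOLserver struct. */
typedef struct PoolSketch {
    struct PoolSketch *nextPtr;  /* next pool in the list, or NULL */
    int waiting;                 /* queued requests */
    int current;                 /* live connection threads */
    int min;                     /* configured minthreads */
    int max;                     /* configured maxthreads */
} PoolSketch;

/*
 * Walk all pools and "start" threads where needed. Returns the number
 * of threads started. With honorMin set, every pool is also topped up
 * to its configured minimum.
 */
int check_pools(PoolSketch *firstPtr, int honorMin)
{
    PoolSketch *poolPtr;
    int started = 0;

    for (poolPtr = firstPtr; poolPtr != NULL; poolPtr = poolPtr->nextPtr) {
        assert(poolPtr->max > 0 && poolPtr->min <= poolPtr->max);

        /* Pending work but nobody left to serve it: revive one thread. */
        if (poolPtr->waiting > 0 && poolPtr->current == 0) {
            poolPtr->current++;
            started++;
        }
        /* Optional extension: bring the pool up to minthreads. */
        while (honorMin && poolPtr->current < poolPtr->min) {
            poolPtr->current++;
            started++;
        }
    }
    return started;
}
```

Such a check could be called from the thread-exit path or periodically;
which of the two is preferable is left open here.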
For the new thread generation, i had to parameterize
NsCreateConnThread() to avoid a resource deadlock
in Ns_ThreadJoin().
In your patch, you change the while loop test:
+ while (poolPtr->threads.maxconns <= 0
+ || ncons-- > 0
+ || poolPtr->queue.wait.num > 1) {
Shouldn't the loop continue with poolPtr->queue.wait.num > 0 ?
0 or 1, both are fine here. The condition triggers for the new cases only
when > 1 is used.
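
To make the effect of the threshold concrete, here is a toy evaluation
of the loop test quoted above. served_by_one_thread() and its
"threshold" parameter are hypothetical; the sketch assumes that each
loop iteration takes exactly one request from the queue and that a
thread finding the queue empty simply stops (a real thread would block
and wait instead):

```c
#include <assert.h>

/*
 * Count how many requests a single thread serves when it starts with
 * "waiting" queued requests. The while condition mirrors the quoted
 * test, with the constant 1 replaced by "threshold" so that > 0 and
 * > 1 can be compared.
 */
int served_by_one_thread(int maxconns, int waiting, int threshold)
{
    int ncons = maxconns;
    int served = 0;

    assert(waiting >= 0 && threshold >= 0);

    /* ncons-- is only evaluated when maxconns > 0 (short-circuit ||). */
    while (maxconns <= 0
           || ncons-- > 0
           || waiting > threshold) {
        if (waiting == 0) {
            break;          /* toy model: nothing to do, stop here */
        }
        waiting--;          /* take one request from the queue */
        served++;
    }
    return served;
}
```

In this model, a thread with maxconns 3 facing 10 queued requests
serves 9 of them with the > 1 test and all 10 with > 0; either way it
serves more than maxconns requests when the queue is long.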
Your patch looks like a great fix...I just still don't understand why the
server would completely lock up. As long as you have requests coming in,
seems like a new thread would get created. I wonder if what is happening is
that Apache Bench simply stops sending new requests because none of the
requests are finishing. If it can block, I wonder if simply visiting with a
browser would kick things off again?
One can use the browser to kick things off again, but if there are
already, say, 50 requests in the queue, the browser hangs on its first
request. The bad thing is that in the meantime some other bulky
requests are queued as well, so the queue might actually be growing
permanently.
for my testing, i use:
ab -c 20 -n 35 http://localhost:8003/file-storage/view/dotlrn-2.3.1.tgz
With maxthreads 5 and maxconns 3 I get reliable hangs, where
the last thread exits with about 5 to 15 queued requests. With the
recent patches, the queue is processed until it is empty.
Btw, the reason why the patch helps is no miracle. It is completely
clear why the server hangs. This bug is not a classical deadlock (it
is still possible for the queued requests to be processed), but it
shares some properties of a livelock (some requests prevent the
processing of other requests, at least for a while). It is not a race
condition either. I don't believe that TriggerDriver() is the right
place to handle the problem, since the driver is happily accepting
requests in the bug situation. It may well be that the bug fixed here
is different from the bug that Jeff Rogers fixed originally.
-gustaf neumann
PS: I am pretty sure that this is the same bug as on openacs.org.
Around the time of the bug, somebody in Spain was apparently
benchmarking his internet connection, downloading dotlrn*.tar*
in multiple parallel sessions. At one point I saw that the user
was trying to download 1200 copies of dotlrn.tar in about 10 minutes.