I believe I've found a bug in queue.c which occasionally causes a core dump at exit.

The bug is caused by this order of events at the end of NsConnThread:

    poolPtr->threads.idle--;
    poolPtr->threads.current--;
    if (poolPtr->threads.current == 0) {
        Ns_CondBroadcast(&poolPtr->cond);
    }
    Ns_ThreadExit(dataPtr);

Ns_ThreadExit doesn't always get to complete, because poolPtr->threads.current can reach zero before Ns_ThreadExit finishes. Since the server is waiting for threads.current to reach zero, nsd then exits while the thread exit code is still running.
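To make the window concrete, here is a minimal sketch of the same ordering using plain POSIX threads. It is not AOLserver code: the Pool struct, conn_thread, simulate_shutdown and the cleanup_done and observed fields are all hypothetical names invented for illustration, and the "observed" gate exists only to force the unlucky interleaving every time instead of occasionally.

```c
#include <pthread.h>

/* Hypothetical stand-in for the pool; not the real AOLserver structure. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int current;       /* models poolPtr->threads.current */
    int cleanup_done;  /* set once the thread's exit work has finished */
    int observed;      /* set once the waiter has sampled cleanup_done */
} Pool;

static void *conn_thread(void *arg)
{
    Pool *p = arg;

    pthread_mutex_lock(&p->lock);
    p->current--;
    if (p->current == 0) {
        pthread_cond_broadcast(&p->cond);   /* the waiter may proceed now */
    }
    /* Artificial gate: hold the exit work back until the waiter has looked. */
    while (!p->observed) {
        pthread_cond_wait(&p->cond, &p->lock);
    }
    p->cleanup_done = 1;   /* stands in for the work done during thread exit */
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Returns 0 on success; reports what the waiter saw at wakeup and after join. */
int simulate_shutdown(int *done_at_wakeup, int *done_after_join)
{
    Pool p;
    pthread_t tid;

    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.cond, NULL);
    p.current = 1;
    p.cleanup_done = 0;
    p.observed = 0;

    if (pthread_create(&tid, NULL, conn_thread, &p) != 0) {
        return -1;
    }

    /* Shutdown side: wait only for the thread count to reach zero. */
    pthread_mutex_lock(&p.lock);
    while (p.current > 0) {
        pthread_cond_wait(&p.cond, &p.lock);
    }
    *done_at_wakeup = p.cleanup_done;   /* exit work has not run yet */
    p.observed = 1;
    pthread_cond_broadcast(&p.cond);    /* open the gate for the worker */
    pthread_mutex_unlock(&p.lock);

    /* Only after joining is the exit work known to be complete. */
    pthread_join(tid, NULL);
    *done_after_join = p.cleanup_done;

    pthread_cond_destroy(&p.cond);
    pthread_mutex_destroy(&p.lock);
    return 0;
}
```

Calling simulate_shutdown(&a, &b) leaves a at 0 and b at 1: the count reaching zero says nothing about whether the exit work has run. In plain pthreads, joining the thread is one way to know that it has; whether that suits nsd is not something this sketch settles.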

I believe the fix is to move the thread exit call (Ns_ThreadExit) to immediately *before* the decrement of the current thread count, i.e.:

    Ns_ThreadExit(dataPtr);

    poolPtr->threads.idle--;
    poolPtr->threads.current--;
    if (poolPtr->threads.current == 0) {
        Ns_CondBroadcast(&poolPtr->cond);
    }

In my stress tests, this led to a successful nsd shutdown even while the server was being pounded by HTTP requests, whereas before the change it regularly crashed.

I found this bug by adding these log statements around the thread exit:

    Ns_Log(Notice, "starting thread exit");
    Ns_ThreadExit(dataPtr);
    Ns_Log(Notice, "ending thread exit");

If you do this and try to exit nsd during a stress test, you will find that several connection threads log the starting message but never the ending message. Sometimes this causes a crash and sometimes not, but it's always bad form.

-john

