Sadly, there's nothing interesting in the logs.  Here's the end of the error and 
access logs from the latest hung server:

[03/May/2004:17:53:33][31889.9225][-sched:idle1-] Notice: starting
[03/May/2004:17:54:28][31889.5124][-conn:dawn::0] Notice: dbdrv: opening database 'ora8:xxxx'
[03/May/2004:18:02:51][31889.12300][-conn:dawn::5] Notice: dbdrv: opening database 'ora8:xxxx'

65.31.2.7 - - [03/May/2004:18:51:01 -0400] "GET /nlor/bdid5u49mtnxw HTTP/1.1" 200 67 "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
198.178.8.81 - - [03/May/2004:18:51:01 -0400] "GET /nlor/ik7dgi4hmxjki HTTP/1.1" 200 67 "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)"

It all looks pretty ordinary.

With 4.0, we did experience the hang-on-shutdown syndrome under Linux, but that does 
appear to be fixed under 4.01.

On the suggestion of a thread/signal handling problem, the comment on the following bug

http://sourceforge.net/tracker/index.php?func=detail&aid=858030&group_id=3152&atid=103152

suggests that this was a problem for v3.3 under Linux.  I've tried using 'kill -SEGV' 
on a running AOLserver 4.01 process, and every time the server has terminated and not 
hung.  But, I'm not sure which thread is handling the signal, so there could be a bad 
thread in there.
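
For reference, one common way to make it predictable which thread receives a signal is to block the signal in every thread and collect it with sigwait() in a single dedicated thread. The following is only a minimal Python sketch of that general pattern, not AOLserver's code; the function name run_signal_thread_demo and the choice of SIGUSR1 are assumptions made for illustration.

```python
import os
import signal
import threading


def run_signal_thread_demo(signum=signal.SIGUSR1):
    """Block signum everywhere, then receive it in one dedicated thread.

    Hypothetical helper for illustration; Unix only.
    """
    received = []

    # Block the signal before starting any thread. New threads inherit the
    # mask, so no thread is interrupted asynchronously by signum.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        def waiter():
            # sigwait() returns the number of a pending signal from the set.
            signo = signal.sigwait({signum})
            received.append((signo, threading.current_thread().name))

        thread = threading.Thread(target=waiter, name="signal-waiter")
        thread.start()
        os.kill(os.getpid(), signum)  # process-directed signal
        thread.join(timeout=5)
    finally:
        # Restore the caller's original signal mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return received


if __name__ == "__main__":
    for signo, thread_name in run_signal_thread_demo():
        print("signal", int(signo), "received in thread", thread_name)
```

If any thread leaves the signal unblocked, the kernel may deliver a process-directed signal to that thread instead, which is the kind of ambiguity described above.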

I presume that the 4.1 CVS HEAD is not ready for use in production.  Is there a way to 
enable additional debug output?

- Fen

-----Original Message-----
From:   Dossy [mailto:[EMAIL PROTECTED]
Sent:   Mon 2004-05-03 15:30
To:     [EMAIL PROTECTED]
Cc:     
Subject:        Re: [AOLSERVER] Intermittent hangs without a message
On 2004.05.03, Fen Tamanaha <[EMAIL PROTECTED]> wrote:
> % ps -ef | grep 17072
> UID        PID  PPID  C STIME TTY          TIME CMD
> nsadmin  17072   900  0 11:54 ?        00:00:02 /apps/aolserver-4.01/bin/nsd -i -t /web/aol-configs/web7-demo-8607.tcl -u nsuser -g nsgroup
> nsadmin  17098 17072  0 11:54 ?        00:00:00 [nsd <defunct>]

I used to see this a lot when running 3.5.x under Linux.  I don't think
I've had this happen since upgrading to 4.x -- however, I'm running CVS
HEAD which is 4.1.

Sadly, I never did figure out what was causing this on 3.5.x.  I have a
feeling under 3.5.x, it was a bug or bad interaction in the nsunix/nsvhr
modules that I didn't have time to track down.  In 4.x, I know there's
an issue with nsd shutdown under Linux where it says it's shutting down,
but the comm. driver times out waiting for the threads to die.

Could be a bad interaction on Linux between signals and threads.  That
pid (17098) is a pseudo-process for Linux threads ... perhaps if one of
the nsd's threads gets a signal it doesn't have a handler for, exactly
what you're seeing could happen.
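
To catch this state without eyeballing the process list, something like the following could scan `ps -ef` output for <defunct> children of a given nsd pid. It is a rough Python sketch that assumes the eight-column layout shown in the quoted output above; the function name defunct_children is made up for illustration.

```python
def defunct_children(ps_output, parent_pid):
    """Return PIDs of <defunct> entries in `ps -ef` output whose PPID is parent_pid.

    Hypothetical helper; assumes columns UID PID PPID C STIME TTY TIME CMD.
    """
    found = []
    for line in ps_output.splitlines():
        fields = line.split(None, 7)
        # Skip the header row and anything that does not look like a process row.
        if len(fields) < 8 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        pid, ppid, cmd = int(fields[1]), int(fields[2]), fields[7]
        if ppid == parent_pid and "<defunct>" in cmd:
            found.append(pid)
    return found


if __name__ == "__main__":
    import subprocess
    import sys

    # Usage: python defunct.py <nsd-pid>
    output = subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout
    print(defunct_children(output, int(sys.argv[1])))
```

Run from cron against the nsd pid, a non-empty result would at least timestamp when the defunct entry first appears.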

Wonder what your nsd was doing when this happened ... any clues in the
server log?

-- Dossy



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to <[EMAIL PROTECTED]> with the
body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: field of 
your email blank.
