Sadly, there's nothing interesting in the logs. Here's the end of the error and access logs from the latest hung server:
[03/May/2004:17:53:33][31889.9225][-sched:idle1-] Notice: starting
[03/May/2004:17:54:28][31889.5124][-conn:dawn::0] Notice: dbdrv: opening database 'ora8:xxxx'
[03/May/2004:18:02:51][31889.12300][-conn:dawn::5] Notice: dbdrv: opening database 'ora8:xxxx'

65.31.2.7 - - [03/May/2004:18:51:01 -0400] "GET /nlor/bdid5u49mtnxw HTTP/1.1" 200 67 "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
198.178.8.81 - - [03/May/2004:18:51:01 -0400] "GET /nlor/ik7dgi4hmxjki HTTP/1.1" 200 67 "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)"

It all looks pretty ordinary. With 4.0, we did experience the hang-on-shutdown syndrome under Linux, but that does appear to be fixed under 4.01.

On the suggestion of a thread/signal handling problem, a comment on the following bug suggests that this was a problem for v3.3 under Linux:

http://sourceforge.net/tracker/index.php?func=detail&aid=858030&group_id=3152&atid=103152

I've tried using 'kill -SEGV' on a running AOLserver 4.01 process, and every time the server has terminated rather than hung. However, I'm not sure which thread is handling the signal, so there could still be a bad thread in there.

I presume that the 4.1 CVS HEAD is not ready for use in production. Is there a way to enable additional debug output?

- Fen

-----Original Message-----
From: Dossy [mailto:[EMAIL PROTECTED]
Sent: Mon 2004-05-03 15:30
To: [EMAIL PROTECTED]
Cc:
Subject: Re: [AOLSERVER] Intermittent hangs without a message

On 2004.05.03, Fen Tamanaha <[EMAIL PROTECTED]> wrote:
> % ps -ef | grep 17072
> UID PID PPID C STIME TTY TIME CMD
> nsadmin 17072 900 0 11:54 ? 00:00:02 /apps/aolserver-4.01/bin/nsd -i -t
> /web/aol-configs/web7-demo-8607.tcl -u nsuser -g nsgroup
> nsadmin 17098 17072 0 11:54 ? 00:00:00 [nsd <defunct>]

I used to see this a lot when running 3.5.x under Linux. I don't think I've had this happen since upgrading to 4.x -- however, I'm running CVS HEAD which is 4.1.

Sadly, I never did figure out what was causing this on 3.5.x. I have a feeling under 3.5.x, it was a bug or bad interaction in the nsunix/nsvhr modules that I didn't have time to track down.

In 4.x, I know there's an issue with nsd shutdown under Linux where it says it's shutting down, but the comm. driver times out waiting for the threads to die.

Could be a bad interaction on Linux between signals and threads. That pid (17098) is a pseudo-process for Linux threads ... perhaps if one of the nsd's threads gets a signal it doesn't have a handler for, exactly what you're seeing could happen.

Wonder what your nsd was doing when this happened ... any clues in the server log?

-- Dossy
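
To catch the symptom in the quoted `ps` output without checking by hand, a small script can scan `ps -ef` text for `<defunct>` entries under a given nsd parent PID. This is only a rough sketch: the function name `defunct_children` and the `SAMPLE` text are made up for illustration, and it assumes the eight-column `ps -ef` layout shown in the quoted output.

```python
def defunct_children(ps_output, parent_pid):
    """Return PIDs of <defunct> entries whose PPID equals parent_pid.

    Assumes `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD.
    """
    pids = []
    for line in ps_output.splitlines():
        # Split into at most 8 fields so CMD keeps its embedded spaces.
        fields = line.split(None, 7)
        if len(fields) < 8 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # header line or anything that is not a process row
        pid, ppid, cmd = int(fields[1]), int(fields[2]), fields[7]
        if ppid == parent_pid and "<defunct>" in cmd:
            pids.append(pid)
    return pids


# Sample text modelled on the ps output quoted in this thread.
SAMPLE = """\
UID PID PPID C STIME TTY TIME CMD
nsadmin 17072 900 0 11:54 ? 00:00:02 /apps/aolserver-4.01/bin/nsd -i -t /web/aol-configs/web7-demo-8607.tcl -u nsuser -g nsgroup
nsadmin 17098 17072 0 11:54 ? 00:00:00 [nsd <defunct>]
"""

if __name__ == "__main__":
    print(defunct_children(SAMPLE, 17072))
```

Run from cron against live `ps -ef` output, something like this could log the time a defunct nsd child first appears, which may help line the hang up with entries in the server log.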
