The following reply was made to PR os-qnx/2142; it has been noted by GNATS.
From: bob ostermann <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc:
Subject: Re: os-qnx/2142: server becomes unresponsive despite active servers. netstat shows ESTABLISHED
Date: Fri, 29 May 1998 07:03:55 -0400

>Date: Fri, 01 May 1998 19:22:57 -0400
>To: [EMAIL PROTECTED]
>From: bob ostermann <[EMAIL PROTECTED]>
>Subject: Re: os-qnx/2142: server becomes unresponsive despite active servers. netstat shows ESTABLISHED
>In-Reply-To: <[EMAIL PROTECTED]>
>
>
>At 12:44 PM 5/1/98 -0000, you wrote:
>>[In order for any reply to be added to the PR database, ]
>>[you need to include <[EMAIL PROTECTED]> in the Cc line ]
>>[and leave the subject line UNCHANGED. This is not done]
>>[automatically because of the potential for mail loops. ]
>>
>>
>>Synopsis: server becomes unresponsive despite active servers. netstat shows ESTABLISHED
>>
>>State-Changed-From-To: open-feedback
>>State-Changed-By: jim
>>State-Changed-When: Fri May 1 05:44:23 PDT 1998
>>State-Changed-Why:
>>Would your lockfile possibly be on an NFS mounted
>>filesystem? It certainly sounds as if it's a blocking
>>issue. Is mod_status enabled? If not, try adding that
>>and then looking at status when the server is hung (if
>>it lets you)
>>
>No NFS here.
>
>Slaying and restarting the server cures the problem, until it hangs again.
>
>mod_status _is_ enabled. I'll check that.
>
>I meant to update the bug-base and say that I downloaded 1.2.6, compiled
>it, and it runs without the problem showing itself, with the same .confs.
>
>When I went to QSSL (QNX tech support) they were puzzled that although
>netstat showed sessions established, sin fd (a ps work-alike that shows
>file descriptors owned by a process) did not show fds commensurate with
>what netstat displayed.
>
>Further, there did not seem to be enough copies of httpd (nor zombies) in
>the task list to warrant the number of established sessions that netstat
>showed.
>
>I did notice that when the server goes into this hung state, it will
>_eventually_ clean up these rogue ESTABLISHED sessions. I did this on a
>development box, with people inside my group, where I could control access
>to the server a little better and prevent the problem from snowballing.
>
>It almost seemed as though these ESTABLISHED sessions cleared up as a
>function of some timeout setting somewhere.
>
>Is it possible that somehow child processes are launching and erroring out
>prematurely, while the fds stay allocated by Socket or some parent thread,
>but not by the process?
>
>IOW, maybe the fdset that the select() thread uses becomes corrupted
>somehow, and successive child processes launch, find nothing, exit
>normally, but leave the fd dangling.
>
></train_of_thought>
>
>what should I look for in the status screen?
>
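
For anyone following this PR, the dangling-descriptor idea in the quoted text can be illustrated with a small sketch. It assumes ordinary POSIX fork() semantics, where a connection stays open for as long as any process still holds a copy of the descriptor. Nothing here is confirmed for QNX's Socket manager or for Apache's own code: a socketpair stands in for an accepted TCP connection, and the names `peer_sees_eof` and `demo` are made up for the illustration.

```python
import os
import socket


def peer_sees_eof(sock):
    """Return True if the other end is fully closed (recv yields b'')."""
    sock.setblocking(False)
    try:
        return sock.recv(1) == b""
    except BlockingIOError:
        # No data and no EOF: some process still holds the other end open.
        return False


def demo():
    # Assumption: a socketpair stands in for an accepted client connection.
    parent_end, peer = socket.socketpair()

    pid = os.fork()
    if pid == 0:
        # Child: inherits the descriptors, "finds nothing", and exits.
        os._exit(0)
    os.waitpid(pid, 0)

    # The child is gone, but the parent still holds its copy of the fd,
    # so the peer does not see the connection close.
    open_after_child_exit = peer_sees_eof(peer)

    # Only once the parent closes its copy does the peer see EOF.
    parent_end.close()
    closed_after_parent_close = peer_sees_eof(peer)

    peer.close()
    return open_after_child_exit, closed_after_parent_close


if __name__ == "__main__":
    print(demo())
```

On a typical Unix system this should print `(False, True)`: the session looks alive after the child exits and only goes away when the parent lets go of the descriptor. That would match a netstat listing with more ESTABLISHED sessions than the visible httpd children can account for, though it does not show that this is what happens on QNX.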
