The only time I've ever seen anything remotely like that was in nsunix for
AOLserver 3.0 and 3.2. There was a bug in nsd/drv.c where it tried to
close a socket, setting it to -1 (which it thought denoted INVALID_SOCKET).
Unfortunately it was using the wrong structure and just so happened to be
poking a -1 into a location in which -1 means, oh goody, let's shut the
server down!
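
To make that class of bug concrete, here is a minimal sketch. It is not the
actual nsd/drv.c code: the two structs, their fields, and every function name
are hypothetical, and the only thing taken from the real bug is the idea of a
-1 landing in a slot where -1 means "shut down".

```c
#include <stddef.h>
#include <string.h>

#define INVALID_SOCKET     (-1)
#define SHUTDOWN_REQUESTED (-1)

/* Hypothetical connection record: the structure the code meant to update. */
struct conn {
    int id;
    int sock;   /* -1 means "no socket open" */
};

/* Hypothetical driver record: the structure the code was actually handed. */
struct driver {
    int id;
    int state;  /* -1 means "shut the server down" */
};

/* Buggy: assumes p points at a struct conn and writes -1 into its sock slot. */
void mark_closed_buggy(void *p)
{
    int invalid = INVALID_SOCKET;
    memcpy((char *)p + offsetof(struct conn, sock), &invalid, sizeof invalid);
}

/* Fixed: the parameter type makes the compiler reject a struct driver. */
void mark_closed(struct conn *c)
{
    c->sock = INVALID_SOCKET;
}

int driver_should_stop(const struct driver *d)
{
    return d->state == SHUTDOWN_REQUESTED;
}

/* Returns 1 if the misdirected write made the driver look like it should stop. */
int demo_misdirected_write(void)
{
    struct driver d = { 1, 0 };
    mark_closed_buggy(&d);          /* wrong structure passed in */
    return driver_should_stop(&d);
}
```

In this sketch the sock and state fields happen to sit at the same offset, so
the stray -1 lands in state. No error is logged anywhere; the server just
decides it has been told to stop.
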
If you aren't using nsunix or some odd communications driver that relies on
drv.c/RunDriver (iirc), then that problem should not be affecting you, and
again, I've never seen the behavior you describe.
Jerry

Sean Owen writes:
> The nssock notice is apparently benign. (I found the code after all. I
> needed to include .cpp files in my grep.)
>
> Here's the real problem. Under heavy load, we're consistently getting this
> behavior: The server suddenly kills all its conn threads (but not the
> aolserver system threads), and does not reopen them without being restarted
> manually. This happens across 4 servers, at approximately the same time,
> after about 40 minutes.
>
> Our load test peaks at 15 minutes, and our cache timeouts are set to 60
> seconds. The app continues to work fine until about the 40 minute point
> each time, before deciding to kill all its threads.
>
> No errors are reported.
>
> Does anyone have any idea what could cause this to happen?
>
> Thanks,
> Sean