I did not notice any unusual activity in the access log or any problem in syslog during or before the time these errors were logged. It could well be a kernel issue. I will raise this problem with the HP Apache support team.
The problem had not happened for a long time, and I am not sure what is triggering it. We might end up carrying a local modification in mod_cgid.c to check for ECONNABORTED before I can put the mod_cgid module back in. I just have to make sure that, if that happens, the daemon is relaunched and takes on requests without problems.

Thanks,
Amol

----- Original Message ----
From: Jeff Trawick <[EMAIL PROTECTED]>
To: [email protected]
Sent: Sunday, March 18, 2007 6:05:33 AM
Subject: Re: mod_cgid and accept() loop

On 3/17/07, Amol Dev <[EMAIL PROTECTED]> wrote:
> After running the Apache-2.0.58 server on mod_cgid on HPUX B.11.23 PA for 3-4
> days, all of a sudden I see the following errors in error_log.
>
> "[Fri Mar 16 07:23:53 2007] [error] (231)Software caused connection abort:
> Error accepting on cgid socket"
>
> There were 18 million such entries in 30 minutes, which means the cgid daemon
> was in an infinite loop.

    len = sizeof(unix_addr);
    sd2 = accept(sd, (struct sockaddr *)&unix_addr, &len);
    if (sd2 < 0) {
        if (errno != EINTR) {
            ap_log_error(APLOG_MARK, APLOG_ERR, errno,
                         (server_rec *)data,
                         "Error accepting on cgid socket");
        }
        continue;
    }

> Error '231' is ECONNABORTED, which is not handled by mod_cgid and puts the
> accept() into an infinite loop.

no, ECONNABORTED will generate a log message and go back into accept and wait for a new connection; it takes an infinite number of such connections (or the kernel acting like there is) to create an infinite loop there

perhaps the kernel is confused? some unknown glitch caused a connection to be aborted once, and the kernel has left it on an internal queue even after accept() is called?

> Not sure why this socket would be shutdown() by anything. But if it does get
> ECONNABORTED, how should mod_cgid handle it?

It handles it correctly today IMHO. Without information on the root cause of the kernel acting like there is an endless number of aborted connections to the mod_cgid socket, I wouldn't suggest any change to Apache.
> Should we handle this error by setting daemon_should_exit++? Does that
> respawn a new daemon without interruption?

You may wish to make a local modification to have the cgid process exit if, for example, 10 consecutive calls to accept() return -1/ECONNABORTED.

You may first want to try to catch it happening again and use tusc to see if the child process(es) handling requests are repeatedly trying to connect to mod_cgid's socket. If they're not doing anything wrong, see about applicable kernel patches.

If by chance you're using HP's Apache-based server and have support for it, give them a call. If anybody has heard of this before, they would likely be in the know.
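
The local modification suggested above (exit after, for example, 10 consecutive accept() calls fail with ECONNABORTED) could be sketched roughly as follows. This is only an illustration, not a patch against mod_cgid.c: the names `abort_tracker`, `abort_tracker_update` and `MAX_CONSECUTIVE_ABORTS` are hypothetical, and only the threshold of 10 and the ECONNABORTED check come from this thread.

```c
#include <errno.h>

/* Hypothetical threshold, from the "10 consecutive calls" suggestion. */
#define MAX_CONSECUTIVE_ABORTS 10

/* Hypothetical state: how many accept() calls in a row have failed
 * with ECONNABORTED. */
typedef struct {
    int consecutive_aborts;
} abort_tracker;

/* Call once after every accept().
 *   rc  - the value returned by accept()
 *   err - errno captured immediately after the call
 * Returns 1 when the streak of ECONNABORTED failures has reached the
 * threshold and the daemon should give up; otherwise returns 0. */
int abort_tracker_update(abort_tracker *t, int rc, int err)
{
    if (rc < 0 && err == ECONNABORTED) {
        t->consecutive_aborts++;
        return t->consecutive_aborts >= MAX_CONSECUTIVE_ABORTS;
    }

    /* A successful accept() or any other error breaks the streak. */
    t->consecutive_aborts = 0;
    return 0;
}
```

In the accept() loop quoted earlier, the helper would be called right after accept(), and a return value of 1 would be the point at which to set daemon_should_exit and leave the loop. Whether a new daemon is then respawned without interruption is the open question from this thread and would still need to be verified on the affected system.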
