So I'm using lighttpd and fast_cgi, which occasionally has a problem where it gets 'stuck'. (Unable to bring fast_cgi back to life, even though resources are once again available.) Usually this results in Error 500s that never go away until lighttpd is restarted.
So to avoid having to manually go in and resurrect the server, I created a shell script that tries to hit the site and checks for an HTTP 200 response. If it doesn't see that, it does a 'tail' of the access and error logs (so that I can see what was happening at the time), and then invokes an "/etc/init.d/lighttpd restart" to kick the server.

I've got the following crontab entry:

  */2 * * * * root THE_SCRIPT

meaning it should run once every 2 minutes, all the time. I only get an email when it produces output, and it only does that if it fails to contact the webserver. However, when it does fail, I get numerous reports at once.

Could this be because the server isn't responding immediately when I check the status? I'm doing that via, in the shell script:

  STATUS=`wget --save-headers http://www.MYSITE.com/ -O - 2> /dev/null | head -1 | cut -d " " -f 2`

In other words, hit the site, include the headers in the output, send it all to stdout, keep only the first line, and chop off the "HTTP/1.1" to get the delicious "200" (hopefully) status.

I guess maybe I need to give it a "--timeout" argument, and something less than 120 seconds, so that the jobs don't run over each other...?

-bill!

_______________________________________________
vox-tech mailing list
[email protected]
http://lists.lugod.org/mailman/listinfo/vox-tech
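For illustration, here is a rough sketch of the "--timeout" idea from the post. It is not the actual script. The lock file path, the 30-second value, and the function names are made up for the example, and it assumes the `flock` utility from util-linux is installed. `--tries=1` is there because wget retries by default, and the lock is there because `--timeout` limits each network wait rather than the whole run, so a slow run could still overlap the next cron job.

```shell
#!/bin/sh
# Sketch only: URL, lock path, and timeout are placeholders, not the real setup.
URL="http://www.MYSITE.com/"
LOCK="/tmp/check_site.lock"   # hypothetical lock file
TIMEOUT=30                    # seconds; well under the 2-minute cron interval

# Read an HTTP response on stdin and print the status code from its first line.
parse_status() {
    head -n 1 | tr -d '\r' | cut -d " " -f 2
}

# Fetch the page once, with a network timeout, and print the status code.
fetch_status() {
    wget --timeout="$TIMEOUT" --tries=1 --save-headers "$URL" -O - 2> /dev/null \
        | parse_status
}

main() {
    # Take a non-blocking lock on fd 9; if an earlier run still holds it, exit quietly.
    exec 9> "$LOCK"
    flock -n 9 || exit 0

    status=$(fetch_status)
    if [ "$status" != "200" ]; then
        echo "Got status '$status' from $URL; restarting lighttpd"
        # (the tail of the access and error logs would go here)
        /etc/init.d/lighttpd restart
    fi
}
```

The block only defines functions; a cron-driven script would end with a call to `main`. Because the lock is dropped when the script exits, a run that is still waiting on the server makes the next cron run exit silently instead of sending another report.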
