Nicholas Leippe wrote:
We have a webserver that occasionally gets pegged. When this happens, the
remote shell becomes nearly totally unresponsive--to the point that we have
it rebooted at the colo. We already have swap disabled. We are trying to
pinpoint which bit of our application is causing this. Obviously, a hard
reset kills any chance of it completing and writing log files with useful
clues.
You might try putting the log files on a separate hard drive. Linux
should be able to complete writes to the logging drive even when the
main drive is swamped. Separate partitions aren't enough; you need
separate spindles. Or, even better, use syslog to send the logs to
another server.
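
If the application happens to be able to log through Python, the remote syslog idea could be sketched roughly like this. It is only an illustration, not a drop-in config: the function name make_remote_logger, the logger name "webapp", and the log host are all made up, and it assumes the receiving syslog daemon is set up to accept UDP messages on the given port.

```python
import logging
import logging.handlers


def make_remote_logger(host, port=514, name="webapp"):
    """Return a logger that forwards records to a syslog daemon over UDP.

    host/port identify the remote log server (hypothetical values);
    514 is the conventional syslog UDP port.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    # With a (host, port) tuple, SysLogHandler sends UDP datagrams by default,
    # so each record leaves the box without touching the local disk.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Usage would be something like log = make_remote_logger("loghost.example.com") followed by log.info("starting request %s", path) at the points you suspect. Since UDP is fire-and-forget, whatever was sent before the hard reset should already be on the other machine.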
Is there some way to lock a root shell such that it is always responsive so
that we could at least kill the webserver and have a chance of the log files
being fully written to find the problem?
RT scheduling? Tell bash to mlock() itself into memory? Ideas?
You might consider writing a script that monitors the load level. When
it rises too high, automatically suspend your web server processes using
"kill -STOP" (SIGSTOP, which is signal 19 on x86 Linux). Then you can
analyze the problem without actually killing the server, and resume it
afterward with "kill -CONT".
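
A minimal sketch of such a monitor, assuming you already know the PIDs of the web server processes (from a pid file or pgrep, for example). The function names, the threshold, and the get_load hook are my own inventions for illustration; the real pieces are os.getloadavg() and os.kill() with SIGSTOP/SIGCONT.

```python
import os
import signal


def suspend_if_overloaded(pids, threshold, get_load=None):
    """Send SIGSTOP to each PID if the 1-minute load average exceeds threshold.

    get_load is an optional callable returning the current load; it exists
    only so the check can be exercised without a real overload.
    Returns the list of PIDs that were actually signalled.
    """
    if get_load is None:
        get_load = lambda: os.getloadavg()[0]  # 1-minute load average

    if get_load() <= threshold:
        return []

    stopped = []
    for pid in pids:
        try:
            os.kill(pid, signal.SIGSTOP)  # suspend, do not terminate
            stopped.append(pid)
        except ProcessLookupError:
            pass  # process exited before we got to it
    return stopped


def resume(pids):
    """Send SIGCONT so previously suspended processes continue running."""
    for pid in pids:
        try:
            os.kill(pid, signal.SIGCONT)
        except ProcessLookupError:
            pass  # process is gone; nothing to resume
```

You would call suspend_if_overloaded(pids, 20.0) every few seconds from a small loop, with the threshold picked to suit your box. Note that it only stops the PIDs it is given, so for a forking server you would need to pass the child PIDs as well as the parent.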
Shane