>>>> During the period of overload, few disks were showing more
>>>> than kilobytes/second of read or write, yet iostat revealed that several
>>>> disks were continuously at 100%.
When I see this situation on bare-metal hardware, I first suspect a disk problem. Failing disks often show this kind of symptom before dying completely: they perform block-repair operations that limit the drive to only a few KB/sec, as reported here. The symptom then goes away for...days? Months? And then--boom, server-fail.

Check the drives' health with smartctl. Look for a Nagios checker script that can continually monitor SMART metrics and temperature.

-rich
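P.S. In case it helps, here is a rough sketch of what such a checker could look like. It is not an existing plugin, just an illustration. It assumes smartmontools 7.0 or later (for the `-j` JSON output), and the JSON field names used (`smart_status.passed`, `ata_smart_attributes.table`, `temperature.current`) are my understanding of that output for ATA drives; verify them against your own `smartctl -j` output. The watched attribute IDs and the 50 C temperature limit are illustrative choices, not recommendations.

```python
import json
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Attribute IDs often associated with failing media (illustrative selection).
WATCHED_IDS = {5, 197, 198}  # reallocated, pending, offline-uncorrectable


def evaluate(report, max_temp_c=50):
    """Map a parsed `smartctl -j -H -A` report to (exit_code, message).

    Field names are assumed from smartmontools 7.x JSON output.
    """
    passed = report.get("smart_status", {}).get("passed")
    if passed is None:
        return UNKNOWN, "UNKNOWN: no SMART health status in report"
    if passed is False:
        return CRITICAL, "CRITICAL: overall SMART health check failed"

    problems = []
    table = report.get("ata_smart_attributes", {}).get("table", [])
    for attr in table:
        raw = attr.get("raw", {}).get("value", 0)
        if attr.get("id") in WATCHED_IDS and raw > 0:
            problems.append(f"{attr.get('name', attr.get('id'))}={raw}")

    temp = report.get("temperature", {}).get("current")
    if temp is not None and temp > max_temp_c:
        problems.append(f"temperature={temp}C")

    if problems:
        return WARNING, "WARNING: " + ", ".join(problems)
    return OK, "OK: SMART health passed, no watched attributes raised"


def check_device(device):
    """Run smartctl on one device and evaluate its JSON report."""
    try:
        # smartctl's exit status is a bitmask, so it is not treated as an error here.
        proc = subprocess.run(
            ["smartctl", "-j", "-H", "-A", device],
            capture_output=True, text=True,
        )
        return evaluate(json.loads(proc.stdout))
    except (OSError, json.JSONDecodeError) as exc:
        return UNKNOWN, f"UNKNOWN: could not read SMART data: {exc}"


if __name__ == "__main__":
    # The device path is supplied by the caller, e.g. /dev/sda.
    code, message = check_device(sys.argv[1])
    print(message)
    sys.exit(code)
```

Run as root (smartctl normally needs it) with the device as the only argument, and let Nagios interpret the exit code. A nonzero pending or reallocated sector count is the pattern I would expect from a drive that intermittently stalls at a few KB/sec.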
