>>>> During the period of overload, few disks were showing more
>>>> than kilobytes/second of read or write, yet iostat revealed that several
>>>> disks were continuously at 100%.

When I see this situation on bare-metal hardware, I first suspect a disk 
problem. Failing disks often lead to this kind of symptom prior to dying 
completely: they'll perform block-repair operations that cause a drive to 
handle only a few KB/sec as reported here. And then the symptom goes away 
for...days? Months? And then--boom, server-fail.

Check the drives' health with smartctl. Look for a Nagios check script that 
can continually monitor SMART attributes and temperature.
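
If you can't find a ready-made checker, here's a rough sketch of what one might look like. It runs `smartctl -H -A` on one device and maps the result onto the usual Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The list of watched attributes, the 50 C temperature limit, and the function names are my own assumptions, and the parsing only expects the ATA-style attribute table; SAS/SCSI drives report health differently and would need their own handling.

```python
#!/usr/bin/env python3
"""Nagios-style SMART check (sketch; thresholds and attribute list are assumptions)."""
import subprocess
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Raw counters that normally stay at zero on a healthy ATA drive (assumed set).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")
MAX_TEMP_C = 50  # hypothetical limit; tune for your hardware


def evaluate(report):
    """Return (status, message) for the text printed by `smartctl -H -A`."""
    problems = []
    health_seen = False
    for line in report.splitlines():
        if "overall-health self-assessment" in line:
            health_seen = True
            if "PASSED" not in line:
                return CRITICAL, "drive reports failing health: " + line.strip()
            continue
        fields = line.split()
        # Attribute rows: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) < 10 or not fields[0].isdigit() or not fields[9].isdigit():
            continue
        name, raw = fields[1], int(fields[9])
        if name in WATCHED and raw > 0:
            problems.append(f"{name}={raw}")
        elif name == "Temperature_Celsius" and raw > MAX_TEMP_C:
            problems.append(f"temperature {raw}C")
    if problems:
        return WARNING, ", ".join(problems)
    if not health_seen:
        return UNKNOWN, "no health line found"
    return OK, "SMART health passed"


def main(device):
    """Run smartctl against one device and print a one-line plugin result."""
    try:
        result = subprocess.run(["smartctl", "-H", "-A", device],
                                capture_output=True, text=True)
    except FileNotFoundError:
        print("SMART UNKNOWN: smartctl not installed")
        return UNKNOWN
    status, message = evaluate(result.stdout)
    print(f"SMART {device}: {message}")
    return status


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```

Run it as root with the device as the only argument (for example `/dev/sda`). A nonzero reallocated or pending sector count only raises a warning here, since a drive can limp along with a few remapped sectors for a while, but given the symptoms above I'd treat any growth in those counters as a reason to schedule a replacement.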

-rich
_______________________________________________
bblisa mailing list
[email protected]
http://www.bblisa.org/mailman/listinfo/bblisa
