Greg Troxel <[email protected]> writes:

> Tom Ivar Helbekkmo <[email protected]> writes:
>
>> Just a quick observation, from a three minute hang just now: the clock
>> on the machine is running three minutes late, and ntpd hasn't noticed
>> yet, since it's on a 1024 second schedule with its peers... ;)
>
> That makes me suspect the kernel was stuck at some high priority level,
>
> Check "vmstat -m" for failures.

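For anyone wanting to automate the "check for failures" step, here is a rough sketch in Python that scans pool rows from 'vmstat -m' and reports any with a nonzero Fail count. It assumes the twelve-column layout of the header shown in the code, which may differ between systems and releases. The function name failing_pools and the constants HEADER and SAMPLE are made up for illustration; the sample row is the buf16k line reported in this message.

```python
# Sketch: flag pools with a nonzero "Fail" count in `vmstat -m` pool output.
# Assumption: rows follow the column layout of HEADER below; other systems
# or versions may print different columns.

HEADER = "Name Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle"


def failing_pools(text):
    """Return (pool name, fail count) for every row whose Fail column is > 0."""
    columns = HEADER.split()
    fail_idx = columns.index("Fail")
    hits = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and rows that do not match the layout.
        if len(fields) != len(columns) or fields == columns:
            continue
        if not fields[fail_idx].isdigit():
            continue
        fails = int(fields[fail_idx])
        if fails > 0:
            hits.append((fields[0], fails))
    return hits


# Sample input: the header plus the buf16k row quoted in this message.
SAMPLE = HEADER + "\n" + "buf16k 16384 62434 7 48435 10071 6145 3926 5154 1 1 0\n"

if __name__ == "__main__":
    for name, fails in failing_pools(SAMPLE):
        print(f"{name}: {fails} failed requests")
```

In practice the text would come from the command's captured output rather than a pasted sample, and rows that do not fit the assumed layout are skipped silently instead of being guessed at.
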
After some more experimenting and tuning, I had some really serious
hangs last night, while the system was very busy.  I wasn't present at
the time, so I didn't get to break into the kernel debugger and look
at tracebacks, but afterwards, 'vmstat -m' shows:

Name    Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
buf16k 16384    62434    7    48435 10071  6145  3926  5154     1     1    0

...and 'vmstat -s':

   130493 faults relock (130484 ok)

Does that suggest any further tests?

-tih
-- 
It doesn't matter how beautiful your theory is, it doesn't matter how
smart you are.  If it doesn't agree with experiment, it's wrong.
  -Richard Feynman
