I had an incident today on several servers where the load average soared and all of my apache processes were in uninterruptible sleep (D state).
I did run scanlocks, and the ps command as requested:

On one server some apache processes looked like this:

    3345 D apache2 dlm_wait_for_recovery

On another, the output was completely different. Nonetheless, the following threads on one server were using 100% cpu:

    [o2net]
    [o2hb-BC778ACE98]
    [dlm_thread]

That box needed to be rebooted, and things recovered.

The output of scanlocks:

    forum3 ~ # ./scanlocks
    /dev/sdb1 M000000000000000000002b7976e45d
    /dev/sdb1 O000000000000000242b4bb00000000
    /dev/sdb1 O0000000000000000dd506300000000
    /dev/sdb1 O0000000000000000f898b400000000
    /dev/sdb1 O0000000000000001b23aa000000000
    /dev/sdb1 O000000000000000133e0c200000000
    /dev/sdb1 O00000000000000011c6ea100000000
    /dev/sdb1 N00000000013fffb30140013c
    /dev/sdb1 N000000000039292b016d640d
    /dev/sdb1 O0000000000000000f8999e00000000
    /dev/sdb1 N000000000042fb6002ff2763
    /dev/sdb1 O0000000000000002d5d64300000000
    /dev/sdb1 O0000000000000001e639b100000000
    /dev/sdb1 O0000000000000000398aa700000000
    /dev/sdb1 O0000000000000000c139fd00000000
    /dev/sdb1 O00000000000000014df4ab00000000
    /dev/sdb1 D0000000000000000815e6ed5b0763f

    forum ~ # ./scanlocks
    /dev/sdd1 O0000000000000000dd506300000000
    /dev/sdd1 M0000000000000000c102c700000000
    /dev/sdd1 M000000000000000039d65600000000
    /dev/sdd1 S000000000000000000000200000000
