Hello,

I have two nodes running the 2.6.9-22.0.2.ELsmp kernel and the OCFS2
1.2.1 RPMs.  About once a week, one of the nodes fences itself and
crashes, and I get a full vmcore on my netdump server.  The netdump log
file shows the shared filesystem LUN (/dev/dm-6) did not respond within
12000ms.  I have not changed the default heartbeat values
in /etc/sysconfig/o2cb.  No other I/O is in progress when this
happens, although both nodes are HP ProLiant servers running the
Insight Manager agents.

Why would the heartbeat fail roughly once a week?  Should I open a
bugzilla and upload my netdump log file?

Thanks.

/Brian/
-- 
       Brian Long                      |         |           |
       IT Data Center Systems          |       .|||.       .|||.
       Cisco Linux Developer           |   ..:|||||||:...:|||||||:..
       Phone: (919) 392-7363           |   C i s c o   S y s t e m s


_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users
