[Ocfs2-users] problems with ocfs2

Sr Cifra Tue, 21 Mar 2006 16:30:25 -0800

Hi, I just installed 10g RAC on ocfs2 with 2 nodes, on an RHEL AS 4 x86_64 server (4 GB RAM, quad Opteron).

Everything seemed OK until the DBA started building the database and running some heavy operations on it. Node 0 of the cluster kernel panics after this console message:

(6,0):o2hb_write_timeout:164
ERROR: Heartbeat write timeout to device dm-0 after 12001 milliseconds
(6,0):o2hb_stop_all_regions:1673
ERROR: stopping heartbeat on all active regions.
Kernel panic - not syncing: ocfs2 is very sorry to be fencing this system by panicking

On Node 1, this was on the console:

(2585,1): o2net_set_nn_state:421
no longer connected to node DC1ORA01 at 192.168.79.169:7777
(32763,1):ocfs2_replay_journal:1123 Recovery node 0 from slot 0 on device (253,0)

Node 1's OS was also barely responsive and wouldn't shut down cleanly.

The DBA said Oracle was creating numerous trace dumps due to I/O errors, especially during heavy load. What do these errors point to? Storage drivers? ocfs2 bugs? Incompatibilities between 10g RAC and ocfs2? Where do I start?

Oh, device dm-0 is a standard logical volume made up of 3 physical volumes from a SAN array. I downloaded the 10g install disk images to it and the install ran just fine, so the storage appears to be working properly, as it does for other server environments.

_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users
