Hi,

We have a two-node cluster running SLES 9 SP2, connected directly to an EMC CX300 for storage.
We are using OCFS (OCFS2 DLM 0.99.15-SLES) for the voting disk etc., and ASM for data files. The system had been running until last Friday, when the whole cluster went down with the following error messages in the /var/log/messages files:

rac1:

Jul 7 14:56:23 rac1 kernel: (0,3):o2net_state_change:512 connection to node rac2.globoforce.com num 1 at 198.87.235.246:7777 has been idle for 10 seconds, shutting it down.
Jul 7 14:56:23 rac1 kernel: (10042,0):o2net_set_nn_state:414 no longer connected to node rac2.globoforce.com at 198.87.235.246:7777
Jul 7 14:56:56 rac1 kernel: (14410,3):ocfs2_replay_journal:1123 Recovering node 1 from slot 1 on device (8,65)

rac2:

Jul 7 14:56:24 rac2 kernel: (0,0):o2net_state_change:512 connection to node rac1.globoforce.com num 0 at 198.87.235.244:7777 has been idle for 10 seconds, shutting it down.
Jul 7 14:56:24 rac2 kernel: (10201,0):o2net_set_nn_state:414 no longer connected to node rac1.globoforce.com at 198.87.235.244:7777
Jul 7 14:56:42 rac2 kernel: (10201,0):o2net_check_quorum:1468 ERROR: fencing this node because it is connected to a half-quorum of 1 out of 2 nodes which doesn't include the lowest active node 0
Jul 7 14:56:42 rac2 kernel: (10201,0):o2hb_stop_all_regions:1589 ERROR: stopping heartbeat on all active regions.
Jul 7 14:56:42 rac2 kernel: Kernel panic: ocfs2 is very sorry to be fencing this system by panicing

I opened an SR with Oracle, and they recommended that we upgrade to SLES 9 SP3 because they don't support the OCFS version that we are running. I asked whether this would sort out the problem, but they replied with a very vague answer.

Can somebody please shed some light on this: is the version of OCFS that we are running very buggy, and does it cause lots of problems like this? And if we upgrade, is it going to sort out the problem, or are we just bringing ourselves into "Supported-land", from where we can get it fixed?
Also (sorry for all the questions :), when we upgrade, is it just a case of upgrading the kernel and the OCFS RPMs?

Thank you in advance for your help, much appreciated!

--
Mark Maiden
Systems Administrator
Globoforce, Ltd
6 Beckett Way
Parkwest
Dublin 12
Ireland
t: +353 1 625 8812
f: +353 1 625 8880
e: [EMAIL PROTECTED]
www.globoforce.com
http://guidance.gospelcom.net/answer.htm
