We are running OCFS2 on SLES9 machines using a FC SAN. Without warning, both nodes become unresponsive: we cannot access either machine via ssh or a terminal (the login hangs after the username is entered), although both machines still respond to pings. This continues until one node is rebooted, at which point the second node resumes normal operation.
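For reference, the relevant part of our /etc/ocfs2/cluster.conf looks roughly like this. Node 2's name, address, and port are taken from the log below; node 1's IP address and the node numbers other than 2 are from memory, and the third node visible in the domain listing ("0 1 2") is omitted here, so treat this as approximate rather than a copy of the file:

    cluster:
            node_count = 3
            name = ocfs2

    node:
            ip_port = 7777
            ip_address = 192.168.1.2
            number = 1
            name = groupwise-1-mht
            cluster = ocfs2

    node:
            ip_port = 7777
            ip_address = 192.168.1.3
            number = 2
            name = groupwise-2-mht
            cluster = ocfs2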
I am not entirely sure that this is an OCFS2 problem at all, but the syslog shows that OCFS2 was having issues. Here is the log from the node that was not rebooted; the node that was rebooted contained no log information. The system appears to have gone down at about 3:00 AM and stayed that way until the other node was rebooted at around 7:15 AM.

Mar 8 03:06:32 groupwise-1-mht kernel: o2net: connection to node groupwise-2-mht (num 2) at 192.168.1.3:7777 has been idle for 10 seconds, shutting it down.
Mar 8 03:06:32 groupwise-1-mht kernel: (0,2):o2net_idle_timer:1310 here are some times that might help debug the situation: (tmr 1173341182.367220 now 1173341192.367244 dr 1173341182.367213 adv 1173341182.367228:1173341182.367229 func (05ce6220:2) 1173341182.367221:1173341182.367224)
Mar 8 03:06:32 groupwise-1-mht kernel: o2net: no longer connected to node groupwise-2-mht (num 2) at 192.168.1.3:7777
Mar 8 03:06:32 groupwise-1-mht kernel: (499,0):dlm_do_master_request:1330 ERROR: link to 2 went down!
Mar 8 03:06:32 groupwise-1-mht kernel: (499,0):dlm_get_lock_resource:914 ERROR: status = -112
Mar 8 03:13:02 groupwise-1-mht kernel: (8476,0):dlm_send_proxy_ast_msg:458 ERROR: status = -107
Mar 8 03:13:02 groupwise-1-mht kernel: (8476,0):dlm_flush_asts:607 ERROR: status = -107
Mar 8 03:19:54 groupwise-1-mht kernel: (147,1):dlm_send_remote_unlock_request:356 ERROR: status = -107
Mar 8 03:19:54 groupwise-1-mht last message repeated 127 times
Mar 8 03:19:55 groupwise-1-mht kernel: (873,0):dlm_do_master_request:1330 ERROR: link to 2 went down!
Mar 8 03:19:55 groupwise-1-mht kernel: (873,0):dlm_get_lock_resource:914 ERROR: status = -107
Mar 8 03:19:55 groupwise-1-mht kernel: (901,0):dlm_do_master_request:1330 ERROR: link to 2 went down!
Mar 8 03:19:55 groupwise-1-mht kernel: (901,0):dlm_get_lock_resource:914 ERROR: status = -107
Mar 8 03:19:56 groupwise-1-mht kernel: (929,0):dlm_do_master_request:1330 ERROR: link to 2 went down!
Mar 8 03:19:56 groupwise-1-mht kernel: (929,0):dlm_get_lock_resource:914 ERROR: status = -107
Mar 8 03:45:29 groupwise-1-mht -- MARK --
Mar 8 04:15:02 groupwise-1-mht kernel: (147,1):dlm_send_remote_unlock_request:356 ERROR: status = -107
Mar 8 04:15:03 groupwise-1-mht last message repeated 383 times
Mar 8 06:27:54 groupwise-1-mht kernel: (147,1):dlm_send_remote_unlock_request:356 ERROR: status = -107
Mar 8 06:27:54 groupwise-1-mht last message repeated 127 times
Mar 8 06:27:54 groupwise-1-mht kernel: (147,1):dlm_send_remote_unlock_request:356 ERROR: status = -107
Mar 8 06:27:54 groupwise-1-mht last message repeated 127 times
Mar 8 06:35:48 groupwise-1-mht kernel: (8872,0):dlm_do_master_request:1330 ERROR: link to 2 went down!
Mar 8 06:35:48 groupwise-1-mht kernel: (8872,0):dlm_get_lock_resource:914 ERROR: status = -107
Mar 8 06:52:45 groupwise-1-mht kernel: (8861,0):dlm_do_master_request:1330 ERROR: link to 2 went down!
Mar 8 06:52:45 groupwise-1-mht kernel: (8861,0):dlm_get_lock_resource:914 ERROR: status = -107
Mar 8 06:54:11 groupwise-1-mht kernel: (8854,3):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 06:54:11 groupwise-1-mht kernel: (8854,3):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 06:54:11 groupwise-1-mht kernel: (8854,3):ocfs2_unlink:840 ERROR: status = -107
Mar 8 06:54:18 groupwise-1-mht kernel: (8855,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 06:54:18 groupwise-1-mht kernel: (8855,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 06:54:18 groupwise-1-mht kernel: (8855,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 06:54:18 groupwise-1-mht kernel: (8855,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 06:54:18 groupwise-1-mht kernel: (8855,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 06:54:18 groupwise-1-mht kernel: (8855,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 06:54:58 groupwise-1-mht kernel: (8853,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 06:54:58 groupwise-1-mht kernel: (8853,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 06:54:58 groupwise-1-mht kernel: (8853,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 07:09:41 groupwise-1-mht kernel: (4192,0):dlm_do_master_request:1330 ERROR: link to 2 went down!
Mar 8 07:09:41 groupwise-1-mht kernel: (4192,0):dlm_get_lock_resource:914 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 07:14:09 groupwise-1-mht kernel: (4236,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 07:15:50 groupwise-1-mht kernel: (4289,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 07:15:50 groupwise-1-mht kernel: (4289,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 07:15:50 groupwise-1-mht kernel: (4289,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 07:15:50 groupwise-1-mht kernel: (4289,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 07:15:50 groupwise-1-mht kernel: (4289,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 07:15:50 groupwise-1-mht kernel: (4289,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 07:16:13 groupwise-1-mht kernel: (4253,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 07:16:13 groupwise-1-mht kernel: (4253,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 07:16:13 groupwise-1-mht kernel: (4253,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 07:18:57 groupwise-1-mht kernel: (4341,0):dlm_do_master_request:1330 ERROR: link to 2 went down!
Mar 8 07:18:57 groupwise-1-mht kernel: (4341,0):dlm_get_lock_resource:914 ERROR: status = -107
Mar 8 07:19:24 groupwise-1-mht kernel: (4356,0):ocfs2_broadcast_vote:725 ERROR: status = -107
Mar 8 07:19:24 groupwise-1-mht kernel: (4356,0):ocfs2_do_request_vote:798 ERROR: status = -107
Mar 8 07:19:24 groupwise-1-mht kernel: (4356,0):ocfs2_unlink:840 ERROR: status = -107
Mar 8 07:20:49 groupwise-1-mht sshd[4375]: Accepted publickey for root from 10.1.31.27 port 1752 ssh2
Mar 8 07:20:50 groupwise-1-mht kernel: (147,0):dlm_send_remote_unlock_request:356 ERROR: status = -107
Mar 8 07:20:50 groupwise-1-mht last message repeated 255 times
Mar 8 07:20:53 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:20:53 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:20:58 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:20:58 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:03 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:03 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:08 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:08 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:13 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:13 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:19 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:19 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:24 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:24 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:29 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:29 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:34 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:34 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:39 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:39 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:44 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:44 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:49 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:49 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:54 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:54 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:21:59 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:21:59 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:22:04 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:04 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:22:10 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:10 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:22:15 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:20 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:20 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:22:25 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:25 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:22:30 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:30 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:22:35 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:35 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:22:40 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:40 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:22:45 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:45 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:22:50 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:50 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:22:55 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:22:55 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:23:01 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:23:01 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:23:06 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:23:06 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:23:11 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:23:11 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:23:16 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:23:16 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:23:21 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:23:21 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:23:26 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:23:26 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:23:31 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:23:31 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:23:36 groupwise-1-mht kernel: (4377,0):dlm_send_remote_convert_request:398 ERROR: status = -107
Mar 8 07:23:36 groupwise-1-mht kernel: (4377,0):dlm_wait_for_node_death:371 2062CE05ABA246988E9CCCDAE253F458: waiting 5000ms for notification of death of node 2
Mar 8 07:23:40 groupwise-1-mht kernel: (28613,2):dlm_get_lock_resource:847 B6ECAF5A668A4573AF763908F26958DB:$RECOVERY: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:40 groupwise-1-mht kernel: (28613,2):dlm_get_lock_resource:874 B6ECAF5A668A4573AF763908F26958DB: recovery map is not empty, but must master $RECOVERY lock now
Mar 8 07:23:41 groupwise-1-mht kernel: (4432,0):ocfs2_replay_journal:1176 Recovering node 2 from slot 1 on device (253,1)
Mar 8 07:23:41 groupwise-1-mht kernel: (4192,0):dlm_restart_lock_mastery:1214 ERROR: node down! 2
Mar 8 07:23:41 groupwise-1-mht kernel: (4192,0):dlm_wait_for_lock_mastery:1035 ERROR: status = -11
Mar 8 07:23:41 groupwise-1-mht kernel: (929,1):dlm_restart_lock_mastery:1214 ERROR: node down! 2
Mar 8 07:23:41 groupwise-1-mht kernel: (929,1):dlm_wait_for_lock_mastery:1035 ERROR: status = -11
Mar 8 07:23:42 groupwise-1-mht kernel: (4341,1):dlm_restart_lock_mastery:1214 ERROR: node down! 2
Mar 8 07:23:42 groupwise-1-mht kernel: (4341,1):dlm_wait_for_lock_mastery:1035 ERROR: status = -11
Mar 8 07:23:42 groupwise-1-mht kernel: (4341,1):dlm_restart_lock_mastery:1214 ERROR: node down! 2
Mar 8 07:23:42 groupwise-1-mht kernel: (4341,1):dlm_wait_for_lock_mastery:1035 ERROR: status = -11
Mar 8 07:23:42 groupwise-1-mht kernel: (4192,0):dlm_get_lock_resource:895 2062CE05ABA246988E9CCCDAE253F458:D000000000000000037872ff59e2a10: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:42 groupwise-1-mht kernel: (499,1):dlm_restart_lock_mastery:1214 ERROR: node down! 2
Mar 8 07:23:42 groupwise-1-mht kernel: (499,1):dlm_wait_for_lock_mastery:1035 ERROR: status = -11
Mar 8 07:23:42 groupwise-1-mht kernel: (929,1):dlm_get_lock_resource:895 2062CE05ABA246988E9CCCDAE253F458:M0000000000000002d2ab960a02ee32: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:43 groupwise-1-mht kernel: (4341,1):dlm_get_lock_resource:895 2062CE05ABA246988E9CCCDAE253F458:D00000000000000005ac8f593b44a80: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:43 groupwise-1-mht kernel: (8872,1):dlm_restart_lock_mastery:1214 ERROR: node down! 2
Mar 8 07:23:43 groupwise-1-mht kernel: (8872,1):dlm_wait_for_lock_mastery:1035 ERROR: status = -11
Mar 8 07:23:43 groupwise-1-mht kernel: (499,1):dlm_get_lock_resource:895 2062CE05ABA246988E9CCCDAE253F458:D0000000000000000059e0c78635d25: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:43 groupwise-1-mht kernel: (8223,2):ocfs2_dlm_eviction_cb:119 device (253,0): dlm has evicted node 2
Mar 8 07:23:43 groupwise-1-mht kernel: (4431,0):dlm_get_lock_resource:847 2062CE05ABA246988E9CCCDAE253F458:M000000000000000000001de83f8b74: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:44 groupwise-1-mht kernel: (8872,1):dlm_get_lock_resource:895 2062CE05ABA246988E9CCCDAE253F458:D0000000000000000ce315c7764670d: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:44 groupwise-1-mht kernel: (4431,0):dlm_get_lock_resource:895 2062CE05ABA246988E9CCCDAE253F458:M000000000000000000001de83f8b74: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:44 groupwise-1-mht kernel: (873,1):dlm_restart_lock_mastery:1214 ERROR: node down! 2
Mar 8 07:23:49 groupwise-1-mht kernel: (873,1):dlm_wait_for_lock_mastery:1035 ERROR: status = -11
Mar 8 07:23:49 groupwise-1-mht kernel: (901,1):dlm_restart_lock_mastery:1214 ERROR: node down! 2
Mar 8 07:23:49 groupwise-1-mht kernel: (901,1):dlm_wait_for_lock_mastery:1035 ERROR: status = -11
Mar 8 07:23:49 groupwise-1-mht kernel: (8861,1):dlm_restart_lock_mastery:1214 ERROR: node down! 2
Mar 8 07:23:49 groupwise-1-mht kernel: (8861,1):dlm_wait_for_lock_mastery:1035 ERROR: status = -11
Mar 8 07:23:49 groupwise-1-mht kernel: (873,1):dlm_get_lock_resource:895 2062CE05ABA246988E9CCCDAE253F458:M0000000000000002fc058c0a084a80: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:49 groupwise-1-mht kernel: (901,1):dlm_get_lock_resource:895 2062CE05ABA246988E9CCCDAE253F458:M0000000000000002ff18686a1b86f4: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:49 groupwise-1-mht kernel: (8861,1):dlm_get_lock_resource:895 2062CE05ABA246988E9CCCDAE253F458:D0000000000000000b2f76e77647700: at least one node (2) torecover before lock mastery can begin
Mar 8 07:23:49 groupwise-1-mht kernel: kjournald starting. Commit interval 5 seconds
Mar 8 07:23:49 groupwise-1-mht kernel: (4431,0):ocfs2_replay_journal:1176 Recovering node 2 from slot 1 on device (253,0)
Mar 8 07:23:55 groupwise-1-mht kernel: (fs/jbd/recovery.c, 255): journal_recover: JBD: recovery, exit status 0, recovered transactions 599034 to 599035
Mar 8 07:23:55 groupwise-1-mht kernel: (fs/jbd/recovery.c, 257): journal_recover: JBD: Replayed 8 and revoked 0/0 blocks
Mar 8 07:23:55 groupwise-1-mht kernel: kjournald starting. Commit interval 5 seconds
Mar 8 07:25:51 groupwise-1-mht kernel: o2net: accepted connection from node groupwise-2-mht (num 2) at 192.168.1.3:7777
Mar 8 07:25:55 groupwise-1-mht kernel: ocfs2_dlm: Node 2 joins domain 2062CE05ABA246988E9CCCDAE253F458
Mar 8 07:25:55 groupwise-1-mht kernel: ocfs2_dlm: Nodes in domain ("2062CE05ABA246988E9CCCDAE253F458"): 0 1 2
Mar 8 07:25:59 groupwise-1-mht kernel: ocfs2_dlm: Node 2 joins domain B6ECAF5A668A4573AF763908F26958DB
Mar 8 07:25:59 groupwise-1-mht kernel: ocfs2_dlm: Nodes in domain ("B6ECAF5A668A4573AF763908F26958DB"): 0 1 2

Andy Kipp
Network Administrator
Velcro USA Inc.
406 Brown Ave.
Manchester, NH 03103
Phone: (603) 222-4844
Email: [EMAIL PROTECTED]
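P.S. One thing we are wondering about is whether the o2net "idle for 10 seconds" timeout seen at the top of the log is simply too tight for our environment. For reference, the heartbeat threshold is configured in /etc/sysconfig/o2cb; the excerpt below shows example values rather than what is verified on our production nodes, and I am not sure whether the SLES9 packages we run expose the newer network timeout variable mentioned in the comment:

    # /etc/sysconfig/o2cb (excerpt -- example values only)
    O2CB_ENABLED=true
    O2CB_BOOTCLUSTER=ocfs2
    # Number of 2-second heartbeat iterations before a node is declared dead;
    # 31 iterations is roughly 60 seconds.
    O2CB_HEARTBEAT_THRESHOLD=31
    # Newer ocfs2-tools releases also accept O2CB_IDLE_TIMEOUT_MS for the o2net
    # idle timeout shown in the log; I have not confirmed our packages support it.
    #O2CB_IDLE_TIMEOUT_MS=30000

Any advice on whether tuning these values (or looking elsewhere, e.g. the network path between the nodes) is the right direction would be appreciated.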