Hi,

Had another node die.
Everything is looking good. I am guessing corosync tried to talk to the other node and it failed, I believe:

Nov 1 00:08:48 demorp2 ntpd[2461]: peers refreshed
Nov 1 00:08:51 demorp2 corosync[2039]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 1 00:08:51 demorp2 corosync[2039]: [CPG ] chosen downlist: sender r(0) ip(10.172.218.52) ; members(old:1 left:0)
Nov 1 00:08:51 demorp2 corosync[2039]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 1 00:09:05 demorp2 corosync[2039]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 1 00:09:05 demorp2 corosync[2039]: [CMAN ] quorum regained, resuming activity
Nov 1 00:09:05 demorp2 corosync[2039]: [QUORUM] This node is within the primary component and will provide service.
Nov 1 00:09:05 demorp2 corosync[2039]: [QUORUM] Members[2]: 1 2
Nov 1 00:09:05 demorp2 corosync[2039]: [QUORUM] Members[2]: 1 2
Nov 1 00:09:05 demorp2 crmd[2725]: notice: cman_event_callback: Membership 320: quorum acquired
Nov 1 00:09:05 demorp2 crmd[2725]: notice: crm_update_peer_state: cman_event_callback: Node demorp1[1] - state is now member (was lost)
Nov 1 00:09:05 demorp2 corosync[2039]: [CPG ] chosen downlist: sender r(0) ip(10.172.218.52) ; members(old:1 left:0)
Nov 1 00:09:05 demorp2 corosync[2039]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 1 00:09:05 demorp2 crmd[2725]: notice: do_state_transition: State transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=peer_update_callback ]
Nov 1 00:09:06 demorp2 corosync[2039]: cman killed by node 1 because we were killed by cman_tool or other application
Nov 1 00:09:06 demorp2 attrd[2723]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Nov 1 00:09:06 demorp2 attrd[2723]: crit: attrd_cs_destroy: Lost connection to Corosync service!
Nov 1 00:09:06 demorp2 attrd[2723]: notice: main: Exiting...
Nov 1 00:09:06 demorp2 attrd[2723]: notice: main: Disconnecting client 0xdc3020, pid=2725...
Nov 1 00:09:06 demorp2 pacemakerd[2712]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Nov 1 00:09:06 demorp2 pacemakerd[2712]: error: mcp_cpg_destroy: Connection destroyed
Nov 1 00:09:06 demorp2 stonith-ng[2721]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Nov 1 00:09:06 demorp2 crmd[2725]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Nov 1 00:09:06 demorp2 crmd[2725]: error: crmd_cs_destroy: connection terminated
Nov 1 00:09:06 demorp2 gfs_controld[2173]: cluster is down, exiting
Nov 1 00:09:06 demorp2 gfs_controld[2173]: daemon cpg_dispatch error 2
Nov 1 00:09:06 demorp2 attrd[2723]: error: attrd_cib_connection_destroy: Connection to the CIB terminated...
Nov 1 00:09:06 demorp2 fenced[2098]: cluster is down, exiting
Nov 1 00:09:06 demorp2 fenced[2098]: daemon cpg_dispatch error 2
Nov 1 00:09:06 demorp2 dlm_controld[2124]: cluster is down, exiting
Nov 1 00:09:06 demorp2 dlm_controld[2124]: daemon cpg_dispatch error 2
Nov 1 00:09:06 demorp2 stonith-ng[2721]: error: stonith_peer_cs_destroy: Corosync connection terminated
Nov 1 00:09:06 demorp2 cib[2720]: warning: qb_ipcs_event_sendv: new_event_notification (2720-2721-11): Broken pipe (32)
Nov 1 00:09:06 demorp2 cib[2720]: warning: cib_notify_send_one: Notification of client crmd/4c1076bf-8a95-4f77-b866-e1bbf5e2ceda failed
Nov 1 00:09:06 demorp2 cib[2720]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Nov 1 00:09:06 demorp2 cib[2720]: error: cib_cs_destroy: Corosync connection lost! Exiting.
Nov 1 00:09:06 demorp2 crmd[2725]: notice: crmd_exit: Forcing immediate exit: Link has been severed (67)
Nov 1 00:09:06 demorp2 lrmd[2722]: warning: qb_ipcs_event_sendv: new_event_notification (2722-2725-6): Bad file descriptor (9)
Nov 1 00:09:06 demorp2 lrmd[2722]: warning: send_client_notify: Notification of client crmd/3598d3e2-600a-4f15-aae2-e087437d6213 failed
Nov 1 00:09:06 demorp2 lrmd[2722]: warning: send_client_notify: Notification of client crmd/3598d3e2-600a-4f15-aae2-e087437d6213 failed
Nov 1 00:09:08 demorp2 kernel: dlm: closing connection to node 1

The other node:

It looks to me like VMware took too long to give this VM a time slice, and corosync responded by killing one node (see the token timeout note after the logs):

Nov 1 00:08:50 demorp1 lrmd[2433]: warning: child_timeout_callback: ybrpstat_monitor_5000 process (PID 32026) timed out
Nov 1 00:08:50 demorp1 lrmd[2433]: warning: operation_finished: ybrpstat_monitor_5000:32026 - timed out after 20000ms
Nov 1 00:08:51 demorp1 crmd[2436]: error: process_lrm_event: LRM operation ybrpstat_monitor_5000 (17) Timed Out (timeout=20000ms)
Nov 1 00:08:52 demorp1 crmd[2436]: notice: process_lrm_event: demorp1-ybrpstat_monitor_5000:17 [ Service running for 18 hours 8 minutes 30 seconds.\n ]
Nov 1 00:08:53 demorp1 lrmd[2433]: warning: child_timeout_callback: ybrpip_monitor_5000 process (PID 32033) timed out
Nov 1 00:08:53 demorp1 lrmd[2433]: warning: operation_finished: ybrpip_monitor_5000:32033 - timed out after 20000ms
Nov 1 00:08:53 demorp1 crmd[2436]: error: process_lrm_event: LRM operation ybrpip_monitor_5000 (22) Timed Out (timeout=20000ms)
Nov 1 00:09:05 demorp1 corosync[1748]: [MAIN ] Corosync main process was not scheduled for 16241.7002 ms (threshold is 8000.0000 ms). Consider token timeout increase.
Nov 1 00:09:05 demorp1 corosync[1748]: [TOTEM ] A processor failed, forming new configuration.
Nov 1 00:09:05 demorp1 corosync[1748]: [TOTEM ] Process pause detected for 15555 ms, flushing membership messages.
Nov 1 00:09:05 demorp1 corosync[1748]: [MAIN ] Corosync main process was not scheduled for 15555.0029 ms (threshold is 8000.0000 ms). Consider token timeout increase.
Nov 1 00:09:05 demorp1 corosync[1748]: [CMAN ] quorum lost, blocking activity
Nov 1 00:09:05 demorp1 corosync[1748]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Nov 1 00:09:05 demorp1 corosync[1748]: [QUORUM] Members[1]: 1
Nov 1 00:09:05 demorp1 corosync[1748]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 1 00:09:05 demorp1 corosync[1748]: [CMAN ] quorum regained, resuming activity
Nov 1 00:09:05 demorp1 corosync[1748]: [QUORUM] This node is within the primary component and will provide service.
Nov 1 00:09:05 demorp1 corosync[1748]: [QUORUM] Members[2]: 1 2
Nov 1 00:09:05 demorp1 corosync[1748]: [QUORUM] Members[2]: 1 2
Nov 1 00:09:05 demorp1 corosync[1748]: [CPG ] chosen downlist: sender r(0) ip(10.172.218.51) ; members(old:2 left:1)
Nov 1 00:09:05 demorp1 corosync[1748]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 1 00:09:05 demorp1 crmd[2436]: notice: process_lrm_event: LRM operation ybrpip_monitor_5000 (call=22, rc=0, cib-update=17, confirmed=false) ok
Nov 1 00:09:05 demorp1 crmd[2436]: notice: peer_update_callback: Our peer on the DC is dead
Nov 1 00:09:05 demorp1 crmd[2436]: notice: cman_event_callback: Membership 320: quorum lost
Nov 1 00:09:05 demorp1 crmd[2436]: notice: cman_event_callback: Membership 320: quorum acquired
Nov 1 00:09:05 demorp1 crmd[2436]: notice: do_state_transition: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_CRMD_STATUS_CALLBACK origin=peer_update_callback ]
Nov 1 00:09:05 demorp1 crmd[2436]: notice: process_lrm_event: LRM operation ybrpstat_monitor_5000 (call=17, rc=0, cib-update=18, confirmed=false) ok
Nov 1 00:09:06 demorp1 crmd[2436]: warning: do_log: FSA: Input I_JOIN_OFFER from route_message() received in state S_ELECTION
Nov 1 00:09:06 demorp1 crmd[2436]: notice: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Nov 1 00:09:06 demorp1 fenced[1822]: telling cman to remove nodeid 2 from cluster
Nov 1 00:09:06 demorp1 fenced[1822]: receive_start 2:3 add node with started_count 1
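Corosync itself hints at a workaround in demorp1's log ("Consider token timeout increase"); the 8000 ms threshold it reports appears to correspond to the default CMAN token timeout of 10000 ms. One option for riding out short hypervisor pauses is to raise the token value. A rough sketch of what that could look like in a CMAN-based /etc/cluster/cluster.conf follows; the cluster name, config_version placeholder, and the 20000 ms value are only illustrations, not tested recommendations for this setup:

    <cluster name="demorp" config_version="N+1">
      <!-- Raise the totem token timeout (in ms) so a short hypervisor
           scheduling pause does not tear down membership.
           20000 here is an example value only. -->
      <totem token="20000"/>
      <!-- existing clusternodes / fencedevices / rm sections unchanged -->
    </cluster>

After bumping config_version, running "cman_tool version -r" should propagate the new configuration to both nodes. That said, a 15-16 second stall like the one in these logs is long enough that it is probably also worth finding out why VMware paused the VM for that long, rather than relying on larger timeouts alone.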