Hello,
I hope that someone can help me...
I have a two-node Pacemaker cluster with two corosync rings.
Ubuntu 10.04, 64 bit. Pacemaker 1.0.8+hg15494-2ubuntu2, corosync 1.2.0-0ubuntu1.
One node (lxds05) disconnected, and "crm status" marked it as offline. I searched
for the reason and found a defective BBU on the RAID controller. After 14 days,
the BBU was replaced.
But node lxds05 still has not rejoined the cluster. One corosync ring is
marked as faulty, the other as okay. "corosync-cfgtool -r" only temporarily
marks the faulty ring as okay.
Any help is appreciated! I do not know how to solve the problem.
Regards
Stefan
Part of syslog:
May 16 10:01:17 lxds07 crmd: [1543]: info: handle_shutdown_request: Creating shutdown request for lxds05 (state=S_IDLE)
May 16 10:01:17 lxds07 cib: [1539]: WARN: cib_peer_callback: Discarding cib_modify message (3) from lxds05: not in our membership
crm_mon -rf:
============
Last updated: Fri May 16 14:28:58 2014
Stack: openais
Current DC: lxds07 - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
6 Resources configured.
============
Online: [ lxds07 ]
OFFLINE: [ lxds05 ]
Full list of resources:
Resource Group: group_omd
pri_fs_omd (ocf::heartbeat:Filesystem): Started lxds07
pri_apache2 (ocf::heartbeat:apache): Started lxds07
pri_nagiosIP (ocf::heartbeat:IPaddr2): Started lxds07
Master/Slave Set: ms_drbd_omd
Masters: [ lxds07 ]
Stopped: [ pri_drbd_omd:1 ]
Clone Set: clone_ping
Started: [ lxds07 ]
Stopped: [ pri_ping:1 ]
res_MailTo_omd_group (ocf::heartbeat:MailTo): Started lxds07
omd_itsc (ocf::omd:omdnagios): Started lxds07
res_MailTo_omd_itsc (ocf::heartbeat:MailTo): Started lxds07
Migration summary:
* Node lxds07: pingd=3000
omd_itsc: migration-threshold=1000000 fail-count=16 last-failure='Fri May 16 10:46:05 2014'