On 16 May 2014, at 11:11 pm, Senftleben, Stefan (itsc) <[email protected]> wrote:
> Hello,
>
> I hope that someone can help me…
> I have a two-node pacemaker cluster with two corosync rings.
> Ubuntu 10.04, 64 bit. Pacemaker 1.0.8+hg15494-2ubuntu2, corosync
> 1.2.0-0ubuntu1.

It _could_ be a pacemaker issue, but 1.0.8 is over 4 years old and I have no idea what additional changes went into hg15494.
So unfortunately your options are to upgrade to something a little more recent that upstream can help you with, or to see if you can get some support for that version from Ubuntu.

Have you tried simply stopping the cluster on both nodes before starting it again?
That has been known to help on occasion.
To do so without stopping the resources managed by the cluster, you could draw inspiration from:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_disconnect_and_reattach.html

> One node (lxds05) disconnected, "crm status" marked it as offline. I searched
> for the reason and found a defective BBU on the RAID controller. After 14 days, the
> BBU was replaced.
> But the node lxds05 has still not rejoined as a member. One corosync ring is
> marked as faulty, the other as okay. "corosync-cfgtool -r" temporarily marks
> the faulty ring as okay.
>
> Any help is appreciated! I do not know how to solve the problem.
>
> Regards
> Stefan
>
>
> part of syslog:
> May 16 10:01:17 lxds07 crmd: [1543]: info: handle_shutdown_request: Creating shutdown request for lxds05 (state=S_IDLE)
> May 16 10:01:17 lxds07 cib: [1539]: WARN: cib_peer_callback: Discarding cib_modify message (3) from lxds05: not in our membership
>
> crm_mon -rf:
> ============
> Last updated: Fri May 16 14:28:58 2014
> Stack: openais
> Current DC: lxds07 - partition with quorum
> Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
> 2 Nodes configured, 2 expected votes
> 6 Resources configured.
> ============
>
> Online: [ lxds07 ]
> OFFLINE: [ lxds05 ]
>
> Full list of resources:
>
>  Resource Group: group_omd
>      pri_fs_omd (ocf::heartbeat:Filesystem): Started lxds07
>      pri_apache2 (ocf::heartbeat:apache): Started lxds07
>      pri_nagiosIP (ocf::heartbeat:IPaddr2): Started lxds07
>  Master/Slave Set: ms_drbd_omd
>      Masters: [ lxds07 ]
>      Stopped: [ pri_drbd_omd:1 ]
>  Clone Set: clone_ping
>      Started: [ lxds07 ]
>      Stopped: [ pri_ping:1 ]
>  res_MailTo_omd_group (ocf::heartbeat:MailTo): Started lxds07
>  omd_itsc (ocf::omd:omdnagios): Started lxds07
>  res_MailTo_omd_itsc (ocf::heartbeat:MailTo): Started lxds07
>
> Migration summary:
> * Node lxds07: pingd=3000
>    omd_itsc: migration-threshold=1000000 fail-count=16 last-failure='Fri May 16 10:46:05 2014'
>
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linuxfoundation.org/mailman/listinfo/openais
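
For illustration only, the suggestion of stopping the cluster on both nodes without stopping the resources could be sketched roughly as below. This follows the general shape of the linked "Disconnect and Reattach" page, which is written for Pacemaker 1.1, so whether it behaves the same on 1.0.8 is an assumption. The init script path `/etc/init.d/corosync`, the helper names (`run`, `detach_cluster`, `stop_stack`, `start_stack`, `reattach_cluster`) and the `DRY_RUN` switch are hypothetical and not taken from this thread. The sketch also assumes no resource has its own `is-managed` setting.

```shell
#!/bin/sh
# Sketch only: restart the cluster stack while leaving resources running.
# Assumption: corosync (with the pacemaker plugin) is controlled through
# /etc/init.d/corosync on both nodes.

DRY_RUN="${DRY_RUN:-1}"   # default 1: print the commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1 (on one node): tell the cluster to stop managing resources.
detach_cluster() {
    run crm_attribute -t crm_config -n is-managed-default -v false
}

# Step 2 (on every node): stop the stack everywhere, then start it again.
stop_stack()  { run /etc/init.d/corosync stop; }
start_stack() { run /etc/init.d/corosync start; }

# Step 3 (on one node, once both nodes show as online):
# let the cluster manage resources again.
reattach_cluster() {
    run crm_attribute -t crm_config -n is-managed-default -v true
}
```

With `DRY_RUN` left at its default of 1, calling `detach_cluster`, `stop_stack`, `start_stack` and `reattach_cluster` in that order only prints what would be executed, which makes it possible to review the sequence before running anything on a production node.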
