On Thu, Jan 28, 2010 at 3:51 PM, Eric Blau <[email protected]> wrote:
> Hi Linux HA list,
>
> I'm having this same problem that was reported previously with 2 servers
> paired up that are not communicating with each other. Each shows the other
> as offline in crm_mon. I'm running Linux HA 2.1.4 in CRM mode. I see these
> messages in the log file:
>
> cib[20479]: 2010/01/28_08:58:25 info: write_cib_contents: Wrote version
> 0.601.1 of the CIB to disk (digest: 22cd418a378a5ee22c1cc6347fa69817)
> cib[18546]: 2010/01/28_08:58:25 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (732b9) from so1b: not in our membership
> cib[18546]: 2010/01/28_08:58:25 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (732bb) from so1b: not in our membership
> cib[20479]: 2010/01/28_08:58:25 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> cib[20479]: 2010/01/28_08:58:25 info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest:
> /var/lib/heartbeat/crm/cib.xml.sig.last)
> cib[18546]: 2010/01/28_08:58:26 WARN: cib_peer_callback: Discarding
> cib_apply_diff message (732c7) from so1b: not in our membership
>
> Each server appears to be rejecting the other from membership. They were
> working fine and arbitrating an IPaddr2 resource before a split brain
> occurred. After the split brain recovered, these errors started appearing.
> I've verified with tcpdump that heartbeat connectivity is intact.
>
> Any ideas?

Basically, you need to get a recent version of Pacemaker.
Heartbeat 2.1.4 is old enough to be hitting the bug I mentioned.

>
> Thanks in advance for any help!
>
> Regards,
> Eric Blau
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Andrew Beekhof
> Sent: Friday, 24 July, 2009 08:39
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] Node ha2 is not sync with node ha1
>
> What version are you using?
> There was a bug like this but it was fixed a long time ago
>
> On Wed, Jul 22, 2009 at 10:02 AM, Ahmed Munir<[email protected]>
> wrote:
>> Hi all,
>> Hoping you all fine. I've got 2 machines and I've installed Linux HA and
>> OpenSIPs on them and configured them as an active-active scenario. Machine 1
>> named ha1, is assigned with virtual IP 192.168.0.184 and machine 2 named
>> ha2, is assigned with virtual IP 192.168.0.185.
>>
>> The integration between HA and OpenSIPs is working fine. Like if I stop the
>> service of HA, machine ha1 comes down, its resources are taken by machine
>> ha2 and when ha1 comes online, ha1 take its resources back from machine ha2
>> and vice versa.
>>
>> If I turn off ha1 machine its resources are taken by machine ha2 and
>> when ha1 comes online, ha1 take its resources back from machine ha2 which is
>> working fine.
>> But when I turn off ha2 machine its resources are taken by
>> machine ha1 and when ha2 comes online, and I check the status of ha2 using
>> crm_mon command,
>> it shows me weird status as I'm listing down below;
>>
>> On ha1 machine;
>>
>> Node: ha1 (e651c120-b9a1-489a-baf7-caf0028ad540): online
>> Node: ha2 (70503c2e-bb4a-48f8-aab3-53696656a4d0): offline
>>
>> IPaddr_1 (heartbeat::ocf:IPaddr): Started ha1
>> IPaddr_2 (heartbeat::ocf:IPaddr): Started ha1
>> OpenSips_1 (heartbeat::ocf:OpenSips): Started ha1
>> OpenSips_2 (heartbeat::ocf:OpenSips): Started ha1
>>
>> On ha2 machine;
>>
>> Node: ha1 (e651c120-b9a1-489a-baf7-caf0028ad540): offline
>> Node: ha2 (70503c2e-bb4a-48f8-aab3-53696656a4d0): online
>>
>> IPaddr_1 (heartbeat::ocf:IPaddr): Started ha2
>> IPaddr_2 (heartbeat::ocf:IPaddr): Started ha2
>> OpenSips_1 (heartbeat::ocf:OpenSips): Started ha2
>> OpenSips_2 (heartbeat::ocf:OpenSips): Started ha2
>>
>> Or sometimes on ha2 machine;
>>
>> Node: ha1 (e651c120-b9a1-489a-baf7-caf0028ad540): online
>> Node: ha2 (70503c2e-bb4a-48f8-aab3-53696656a4d0): offline
>>
>> IPaddr_1 (heartbeat::ocf:IPaddr): Started ha1
>> IPaddr_2 (heartbeat::ocf:IPaddr): Started ha1
>> OpenSips_1 (heartbeat::ocf:OpenSips): Started ha1
>> OpenSips_2 (heartbeat::ocf:OpenSips): Started ha1
>>
>> After that I've checked logs and I'm getting these errors as listed below;
>>
>> Jul 22 14:12:06 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3a9) from ha2: not in our membership
>> Jul 22 14:12:06 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3aa) from ha2: not in our membership
>> Jul 22 14:12:06 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3ab) from ha2: not in our membership
>> Jul 22 14:12:06 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3ac) from ha2: not in our membership
>> Jul 22 14:12:06 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3ad) from ha2: not in our membership
>> Jul 22 14:12:07 ha1 cib: [9978]: WARN: cib_peer_callback: Discarding
>> cib_apply_diff message (3b0) from ha2: not in our membership
>> Jul 22 14:12:07 ha1 ccm: [9977]: ERROR: llm_set_uptime: Negative uptime
>> -1778384896 for node 0 [ha1]
>> Jul 22 14:12:07 ha1 ccm: [9977]: ERROR: llm_set_uptime: Negative uptime
>> -1879048192 for node 1 [ha2]
>>
>> Even I've configured same settings on both machines but I don't know why
>> I'm getting these errors.
>>
>> Further added I'm attaching cib.xml, OpenSips (which I created resource file
>> for OpenSIPs), ha.cf and log files. Kindly do have a look and update
>> me ASAP.
>>
>>
>> --
>> Regards,
>>
>> Ahmed Munir
>>
>> _______________________________________________
>> Linux-HA mailing list
>> [email protected]
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
