First, thanks for your reply, Lars. No luck with the latest version (2.1.2) either; I am having the same problem.
Here are my config files.

ha.cf:

debug 1
logfile /var/log/ha-log
keepalive 2
warntime 30
deadtime 80
initdead 90
node MASTERNODE
node SLAVENODE
bcast eth0
udpport 694
auto_failback on
ping_group ping-cluster-test 10.10.10.1 10.10.10.151
respawn hacluster /usr/lib/heartbeat/ipfail
crm off

haresources:

MASTERNODE 10.10.10.157 apache2 mon

I hope you can spot something obvious.

Thanks & Regards
Shailesh

On Wed, 2007-08-01 at 12:50 +0200, Lars Daniel Forseth wrote:
> 2.0.5 is quite old and has quite a few bugs!
>
> I suggest you try out the newest release (2.1.2) first and see if that
> helps you somewhat. If that doesn't help, you could ask again here,
> including your config files and a detailed description... :)
>
>
> greets Lars.
>
>
> Shailesh wrote:
> > Hi,
> > I have not received any responses to my issue yet. I have the
> > feeling that the solution to this problem is straightforward and that
> > some experts on the list can guide me out of this. Hope you can oblige.
> >
> > Thanks & Regards
> > Shailesh
> >
> >
> > On Tue, 2007-07-31 at 12:05 +0530, Shailesh wrote:
> >> Hello All,
> >> I would appreciate it if you could help me with this problem I
> >> am facing with Apache HA with HB and MON.
> >>
> >> I have been working on setting up a 2-node failover cluster for my
> >> web service. I have installed heartbeat 2.0.5 and MON on the 2 SUSE
> >> Linux servers. MON is monitoring the Apache webserver. I tested two
> >> methods of causing a failover and then a failback. I end up with a
> >> split brain in the cluster in Method 1.
> >>
> >>
> >> Method 1:
> >>
> >> I find that SLAVENODE takes all the resources if I stop heartbeat on
> >> the MASTERNODE by running 'rcheartbeat stop'; this is quite normal.
> >> But if I run 'rcheartbeat start' on the MASTERNODE to restart
> >> heartbeat, the MASTERNODE thinks the SLAVENODE is dead and takes over
> >> the resources, ending up in an unrecoverable split-brain.
> >>
> >> Method 2:
> >> Surprisingly, if I cause the failover by pulling out the network
> >> cable and then restore the network cable, followed by starting
> >> heartbeat again on the MASTERNODE, I see that MASTERNODE senses the
> >> SLAVENODE, SLAVENODE relinquishes resources to MASTERNODE, and all
> >> seems fine.
> >>
> >> I do not understand why Method 1 of failover ends up in
> >> a split brain.
> >>
> >> My ha.cf and haresources are below.
> >>
> >> debug 1
> >> logfile /var/log/ha-log
> >> keepalive 2
> >> warntime 30
> >> deadtime 80
> >> initdead 90
> >> node MASTERNODE
> >> node SLAVENODE
> >> bcast eth0
> >> udpport 694
> >> auto_failback on
> >> ping_group ping-cluster-test 10.10.10.1 10.10.10.151
> >> respawn hacluster /usr/lib/heartbeat/ipfail
> >> crm off
> >>
> >> Also attached are the master and slave dumps from when the split
> >> brain occurs in Method 1.
> >>
> >> It would be great to get your solutions to this.
> >>
> >>
> >> Regards
> >> Shailesh P Shirali
> >>
> >> _______________________________________________
> >> Linux-HA mailing list
> >> [email protected]
> >> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >> See also: http://linux-ha.org/ReportingProblems
