Re: [Linux-HA] split brain on auto- failback

Shailesh Wed, 01 Aug 2007 22:07:42 -0700

Firstly, thanks for your reply, Lars.

No luck with the latest version (2.1.2) either; I am still having the
same problem.


My config is listed below.

ha.cf :
  debug 1
  logfile /var/log/ha-log
  keepalive 2
  warntime 30
  deadtime 80
  initdead 90
  node MASTERNODE
  node SLAVENODE
  bcast eth0
  udpport 694
  auto_failback on
  ping_group ping-cluster-test 10.10.10.1 10.10.10.151
  respawn hacluster /usr/lib/heartbeat/ipfail
  crm off


haresources :

MASTERNODE 10.10.10.157 apache2 mon
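
As a sanity check on the timing directives above, here is a minimal Python
sketch. It is only an illustration under stated assumptions: parse_timings
and check_timings are hypothetical helpers, not part of heartbeat; values are
assumed to be plain integers in seconds; and the rule that initdead should be
at least twice deadtime is the rule of thumb from the sample ha.cf comments,
not a confirmed cause of the split brain.

```python
# Hypothetical helpers (not part of heartbeat): read ha.cf-style text and
# flag timing directives that break common rules of thumb.

HA_CF = """\
debug 1
logfile /var/log/ha-log
keepalive 2
warntime 30
deadtime 80
initdead 90
node MASTERNODE
node SLAVENODE
bcast eth0
udpport 694
auto_failback on
ping_group ping-cluster-test 10.10.10.1 10.10.10.151
respawn hacluster /usr/lib/heartbeat/ipfail
crm off
"""

TIMING_KEYS = ("keepalive", "warntime", "deadtime", "initdead")


def parse_timings(text):
    """Return {directive: seconds} for the timing directives in text.

    Assumption: values are plain integers in seconds (no "ms" suffix).
    """
    timings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        parts = line.split()
        if len(parts) == 2 and parts[0] in TIMING_KEYS and parts[1].isdigit():
            timings[parts[0]] = int(parts[1])
    return timings


def check_timings(t):
    """Return a list of warnings; a check is skipped if a value is missing."""
    warnings = []
    if "keepalive" in t and "warntime" in t and t["keepalive"] >= t["warntime"]:
        warnings.append("keepalive should be smaller than warntime")
    if "warntime" in t and "deadtime" in t and t["warntime"] >= t["deadtime"]:
        warnings.append("warntime should be smaller than deadtime")
    if "deadtime" in t and "initdead" in t and t["initdead"] < 2 * t["deadtime"]:
        warnings.append(
            "initdead (%d) is less than twice deadtime (%d)"
            % (t["initdead"], t["deadtime"])
        )
    return warnings


if __name__ == "__main__":
    for message in check_timings(parse_timings(HA_CF)):
        print("warning:", message)
```

With the values listed above, the only check that fires is the initdead one
(90 is less than 2 x 80). Whether that is related to the behaviour in
Method 1 is not established here.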



I hope you can spot something obvious.

Thanks & Regards
Shailesh

On Wed, 2007-08-01 at 12:50 +0200, Lars Daniel Forseth wrote:
> 2.0.5 is quite old and has quite a few bugs!
> 
> I suggest you try out the newest release (2.1.2) first and see if that 
> somewhat helps you. If that doesn't help you could ask again here 
> including your config files and a detailed description... :)
> 
> 
> greets Lars.
> 
> 
> Shailesh wrote:
> > Hi,
> >      I have not received any responses to my issue yet. I have the
> > feeling that the solution to this problem is straightforward and that
> > some experts on the list can guide me out of this. I hope you can oblige.
> > 
> > Thanks & Regards
> > Shailesh 
> > 
> > 
> > On Tue, 2007-07-31 at 12:05 +0530, Shailesh wrote:
> >> Hello All,
> >>         I would appreciate it if you could help me with a problem I
> >> am facing with Apache HA using heartbeat (HB) and MON.
> >>
> >>         I have been working on setting up a 2-node failover cluster for my
> >> web service. I have installed heartbeat 2.0.5 and MON on the 2 SUSE
> >> Linux servers. MON is monitoring the Apache web server. I tested two
> >> methods of causing a failover and then a failback. I end up with a
> >> split brain in the cluster in Method 1.
> >>
> >>
> >> Method 1:
> >>
> >> I find that SLAVENODE takes over all the resources if I stop heartbeat
> >> on the MASTERNODE by running 'rcheartbeat stop'; this is expected.
> >> But if I run 'rcheartbeat start' on the MASTERNODE to restart
> >> heartbeat, the MASTERNODE thinks the SLAVENODE is dead and takes over
> >> the resources, ending up in an unrecoverable split brain.
> >>
> >> Method 2:
> >> Surprisingly, if I cause the failover by pulling out the network
> >> cable, then reconnect the cable and start heartbeat again on the
> >> MASTERNODE, the MASTERNODE senses the SLAVENODE, the SLAVENODE
> >> relinquishes the resources to the MASTERNODE, and everything seems
> >> fine.
> >>
> >> I do not understand why Method 1 of failover ends up in
> >> a split brain.
> >>
> >> My ha.cf and haresources are as below.
> >>
> >> debug 1
> >> logfile /var/log/ha-log
> >> keepalive 2
> >> warntime 30
> >> deadtime 80
> >> initdead 90
> >> node MASTERNODE
> >> node SLAVENODE
> >> bcast eth0
> >> udpport 694
> >> auto_failback on
> >> ping_group ping-cluster-test 10.10.10.1 10.10.10.151
> >> respawn hacluster /usr/lib/heartbeat/ipfail
> >> crm off
> >>
> >> Also attached are the master and slave dump when split brain occurs in
> >> Method-1.
> >>
> >> It would be great to get your solutions to this.
> >>
> >>
> >> Regards
> >> Shailesh P Shirali
> >>
> >> _______________________________________________
> >> Linux-HA mailing list
> >> [email protected]
> >> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >> See also: http://linux-ha.org/ReportingProblems
> > 

