Re: [Linux-HA] heartbeat step down after split brain scenario
Hi - thanks for the response.

Dimitri Maziuk wrote:
> What do you mean by disconnecting: what's your failure scenario and how
> do you expect it to be handled?

The disconnection is the loss of the intersite link, which interrupts heartbeat comms. In this case it's expected that both sites will acquire the resources and become active. However, what I want to happen is that one of the sites will give up the resources again when it sees that the other site is up again.

Dimitri Maziuk wrote:
> Running daemons are not guaranteed (arguably, expected) to notice when
> the network cable is unplugged. You have to monitor the link and restart
> all processes that bind()/listen() on the interface.
>
> If your nodes are at different sites, you need to also deal with the
> loss of link at the switch, gateway, etc., and figure out which one is
> still connected to the Internet -- and gets to keep the VIP. Which in
> general can't be done from the nodes themselves.

Yes - in this case neither site has to be connected to the internet; this is more an internal load balancing act between two connected sites in a customer's network.

What I found is that by setting auto_failback on in ha.cf at both sites, the site/node listed in haresources will keep the resources when the link is re-established and the other site will release the resources. This is the result I was looking for.

Regards
Jack

--
View this message in context: http://old.nabble.com/heartbeat-step-down-after-split-brain-scenario-tp31858728p31884521.html
Sent from the Linux-HA mailing list archive at Nabble.com.
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
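For readers who want to try the setup described in the message above, a minimal sketch of the two files involved might look like the following. Only `auto_failback on` and the use of haresources come from the thread. The node names (`site-a`, `site-b`), the addresses, the interface name, the timing values, and the `haproxy` resource name (which assumes a matching init or resource script exists on both nodes) are hypothetical placeholders, not the poster's actual configuration.

```
# --- /etc/ha.d/ha.cf (same file on both nodes) ---
# Timing values are illustrative only.
keepalive 2
deadtime 30

# Heartbeat over the single intersite link. Both peers are listed so the
# file can be identical on both nodes; a ucast line that points at the
# local machine is effectively ignored.
ucast eth0 192.0.2.1
ucast eth0 192.0.2.2

# The setting reported in the thread: once the peers can hear each other
# again, resources go back to the node named in haresources.
auto_failback on

# Node names must match `uname -n` on each host (hypothetical names here).
node site-a
node site-b

# --- /etc/ha.d/haresources (same file on both nodes) ---
# site-a is the preferred owner of the VIP and of haproxy.
site-a IPaddr::192.0.2.10/24/eth0 haproxy
```

With files along these lines, `site-a` is the node expected to keep the resources once the link returns, matching the behaviour reported above. It would be worth repeating the disconnect test on the actual cluster to confirm this before relying on it.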
[Linux-HA] heartbeat step down after split brain scenario
I have a two node cluster using heartbeat and haproxy. Unfortunately it is impossible to provide redundant heartbeat paths between the two nodes at different sites so it is possible for a failure to cause split brain.

To evaluate the impact I tried disconnecting the two nodes and I found that both become active and both try to keep the VIPs after the link is restored. Is this avoidable using the auto_failback option?

--
View this message in context: http://old.nabble.com/heartbeat-step-down-after-split-brain-scenario-tp31858728p31858728.html
Re: [Linux-HA] heartbeat step down after split brain scenario
On 06/16/2011 04:28 AM, Jack Berg wrote:
> I have a two node cluster using heartbeat and haproxy. Unfortunately it
> is impossible to provide redundant heartbeat paths between the two nodes
> at different sites so it is possible for a failure to cause split brain.
>
> To evaluate the impact I tried disconnecting the two nodes and I found
> that both become active and both try to keep the VIPs after the link is
> restored.

What do you mean by disconnecting: what's your failure scenario and how do you expect it to be handled?

Running daemons are not guaranteed (arguably, expected) to notice when the network cable is unplugged. You have to monitor the link and restart all processes that bind()/listen() on the interface.

If your nodes are at different sites, you need to also deal with the loss of link at the switch, gateway, etc., and figure out which one is still connected to the Internet -- and gets to keep the VIP. Which in general can't be done from the nodes themselves.

Dima
--
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu