Doug Lochart <dlochart <at> gmail.com> writes:
>
> I am in the testing stage of my 2 node HA cluster. I am running
> heartbeat 2.1.3_3 and DRBD 8.0.8. My highly available resources are
>
> 1 IP address
> sshd ( I have a secondary admin sshd process running on a different port)
> a custom java application
>
> We are also running rsync over ssh as in rsync -av --rsh="ssh ..."
>
> When a client is connected and rsyncing data I issue an hb_takeover
> from the secondary node. Everything swaps over to the new machine
> just fine. We rerun the client and we get a connection timeout
> message. Then I run hb_takeover from the new secondary node (initial
> primary) and again all resources swap over successfully. We try the
> client again and it works.
>
> We have a Watchguard Firewall between the client and the cluster.
> Behind the firewall I am able to ssh from the secondary node to the
> primary node on the internal ip address that is a resource. I have
> full connectivity between the machines on all ip addresses.
>
> I feel this is an ARP cache issue on the firewall.
>
> My question to the masses is this.
>
> Does/Can heartbeat do any upstream ARP management at its router?
> If not how can one programmatically flush the ARP cache on a firewall
> from another machine? Is this possible?
>
> regards,
>
> Doug
>
Hi Doug,
We had the same problem. Between the heartbeat cluster and the clients there is
a gateway which does not receive broadcasts, so it does not notice the change
when the virtual IPs move to another host. The only way we found to solve this
problem is to send an arping directly to this gateway/switch.
I have added a function notify_switches to the script
/usr/lib/ocf/resource.d/heartbeat/IPaddr:
notify_switches() {
    # Quote the variable so the test does not break when it is empty or unset.
    if [ -n "$OCF_RESKEY_nic" ]
    then
        INTERFACE=$OCF_RESKEY_nic
        IP=$OCF_RESKEY_ip
        # notify switches about IP<=>MAC change
        # -f : quit on first reply
        # -q : be quiet
        # -c count : how many packets to send
        # -w timeout : how long to wait for a reply
        # -I device : which ethernet device to use (eth0)
        # -s source : source ip address
        # -U : Unsolicited ARP mode, update your neighbours
        /sbin/arping -f -q -c 5 -w 5 -I "$INTERFACE" -s "$IP" -U <switch ip or name>
    fi
}
This function is called from the function ip_start, because at that point the
interface name is known. This works very well for our cluster.
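If more than one upstream device caches the old MAC address, a variant of the
function above that loops over several targets might look like the sketch
below. It is only an illustration under stated assumptions: NOTIFY_TARGETS and
ARPING are hypothetical variables introduced here (they are not parameters of
the IPaddr agent), and the arping flags are the same ones listed above.

```shell
# Hypothetical: path to arping, overridable so the function can be dry-run.
ARPING=${ARPING:-/sbin/arping}
# Hypothetical: space-separated IPs or names of the devices to notify.
NOTIFY_TARGETS=${NOTIFY_TARGETS:-}

notify_switches() {
    # Do nothing unless both the interface and the address are known.
    [ -n "$OCF_RESKEY_nic" ] || return 0
    [ -n "$OCF_RESKEY_ip" ] || return 0

    for target in $NOTIFY_TARGETS
    do
        # Send unsolicited ARP to each target; a failure is reported
        # but does not abort the loop or fail the resource start.
        $ARPING -f -q -c 5 -w 5 -I "$OCF_RESKEY_nic" \
            -s "$OCF_RESKEY_ip" -U "$target" ||
            echo "notify_switches: arping to $target failed" >&2
    done
    return 0
}
```

To see which commands would be sent without touching the network, set
ARPING=echo and NOTIFY_TARGETS to a list of placeholder addresses, then call
notify_switches by hand with OCF_RESKEY_nic and OCF_RESKEY_ip set.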
Kind regards,
Christof
_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems