Hi,
On Fri, Mar 14, 2008 at 08:04:43AM +0000, Christof Wiltschek wrote:
> Doug Lochart <dlochart <at> gmail.com> writes:
>
> >
> > I am in the testing stage of my 2 node HA cluster. I am running
> > heartbeat 2.1.3_3 and DRBD 8.0.8. My highly available resources are
> >
> > 1 IP address
> > sshd ( I have a secondary admin sshd process running on a different port)
> > a custom java application
> >
> > We are also running rsync over ssh as in rsync -av --rsh="ssh ..."
> >
> > When a client is connected and rsyncing data I issue an hb_takeover
> > from the secondary node. Everything swaps over to the new machine
> > just fine. We rerun the client and we get a connection timeout
> > message. Then I run hb_takeover from the new secondary node (initial
> > primary) and again all resources swap over successfully. We try the
> > client again and it works.
> >
> > We have a Watchguard Firewall between the client and the cluster.
> > Behind the firewall I am able to ssh from the secondary node to the
> > primary node on the internal ip address that is a resource. I have
> > full connectivity between the machines on all ip addresses.
> >
> > I feel this is an ARP cache issue on the firewall.
> >
> > My question to the masses is this.
> >
> > Does/Can heartbeat do any upstream ARP management at its router?
> > If not, how can one programmatically flush the ARP cache on a firewall
> > from another machine? Is this possible?
> >
> > regards,
> >
> > Doug
> >
>
>
> Hi Doug,
>
> we had the same problem. Between the heartbeat cluster and the clients there
> is a gateway which does not receive broadcasts. So the gateway doesn't notice
> the change when the virtual IPs move to another host. The only way we found
> to solve this problem is to send an arping directly to this switch.
>
> I have added a function notify_switches to the script
> /usr/lib/ocf/resource.d/heartbeat/IPaddr:
Doug: Has this helped you?
Perhaps then this should be added to IPaddr. We could add an
attribute to contain a list of network devices which need
explicit arp cache updates. Anybody have an opinion on this? I
can recall that the issue of arp caches would come up sometimes.
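To make the idea concrete, here is a rough sketch of how such an attribute might be used. It assumes a hypothetical parameter OCF_RESKEY_arp_notify_hosts holding a whitespace-separated list of devices. That name is made up for illustration and is not an existing IPaddr attribute. The ARPING variable is also an assumption, added only so the binary can be overridden for testing. The arping options are the ones from Christof's function.

```shell
#!/bin/sh
# Sketch only: OCF_RESKEY_arp_notify_hosts is a hypothetical parameter,
# not an existing IPaddr attribute.
ARPING="${ARPING:-/sbin/arping}"

notify_arp_hosts() {
    # Nothing to do unless interface, IP and host list are all set.
    [ -n "$OCF_RESKEY_nic" ] || return 0
    [ -n "$OCF_RESKEY_ip" ] || return 0
    [ -n "$OCF_RESKEY_arp_notify_hosts" ] || return 0

    # The list is deliberately left unquoted so it splits on whitespace.
    for host in $OCF_RESKEY_arp_notify_hosts; do
        # Send unsolicited ARP for the cluster IP to each listed device.
        "$ARPING" -f -q -c 5 -w 5 -I "$OCF_RESKEY_nic" \
            -s "$OCF_RESKEY_ip" -U "$host" ||
            echo "arping to $host failed" >&2
    done
    return 0
}
```

A failed arping is only reported on stderr here, so one unreachable device would not make the start of the IP resource fail. Whether that is the right behaviour is open for discussion.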
Cheers,
Dejan
>
> notify_switches() {
>     # Run only when the interface name is set. The variable is quoted so
>     # the test does not break when it is empty.
>     if [ -n "$OCF_RESKEY_nic" ]
>     then
>         INTERFACE=$OCF_RESKEY_nic
>         IP=$OCF_RESKEY_ip
>         # notify switches about IP<=>MAC change
>         # -f : quit on first reply
>         # -q : be quiet
>         # -c count : how many packets to send
>         # -w timeout : how long to wait for a reply
>         # -I device : which ethernet device to use (eth0)
>         # -s source : source ip address
>         # -U : Unsolicited ARP mode, update your neighbours
>         /sbin/arping -f -q -c 5 -w 5 -I "$INTERFACE" -s "$IP" -U <switch ip or name>
>     fi
> }
>
> This function is called in the function ip_start (at this time the interface
> name is known). This works for our cluster very well.
>
> Kind regards,
> Christof
>
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
--
Dejan