Hi,

On Fri, Mar 14, 2008 at 08:04:43AM +0000, Christof Wiltschek wrote:
> Doug Lochart <dlochart <at> gmail.com> writes:
> 
> > 
> > I am in the testing stage of my 2 node HA cluster.  I am running
> > heartbeat 2.1.3_3 and DRBD 8.0.8.  My highly available resources are
> > 
> > 1 IP address
> > sshd ( I have a secondary admin sshd process running on a different port)
> > a custom java application
> > 
> > We are also running rsync over ssh as in rsync -av --rsh="ssh ..."
> > 
> > When a client is connected and rsyncing data I issue an hb_takeover
> > from the secondary node.  Everything swaps over to the new machine
> > just fine.  We rerun the client and we get a connection timeout
> > message.  Then I run hb_takeover from the new secondary node (initial
> > primary) and again all resources swap over successfully. We try the
> > client again and it works.
> > 
> > We have a Watchguard Firewall between the client and the cluster.
> > Behind the firewall I am able to ssh from the secondary node to the
> > primary node on the internal ip address that is a resource.  I have
> > full connectivity between the machines on all ip addresses.
> > 
> > I feel this is an ARP cache issue on the firewall.
> > 
> > My question to the masses is this.
> > 
> > Does/Can heartbeat do any upstream ARP management at its router?
> > If not, how can one programmatically flush the ARP cache on a firewall
> > from another machine?  Is this possible?
> > 
> > regards,
> > 
> > Doug
> > 
> 
> 
> Hi Doug,
> 
> we had the same problem. Between the heartbeat cluster and the clients there
> is a gateway which does not receive broadcasts, so it does not notice the
> change when the virtual IPs move to another host. The only way we found to
> solve this problem is to send an arping directly to this device.
> 
> I have added a function notify_switches to the script
> /usr/lib/ocf/resource.d/heartbeat/IPaddr:

Doug: Has this helped you?

Perhaps this should then be added to IPaddr. We could add an
attribute holding a list of network devices which need explicit
ARP cache updates. Does anybody have an opinion on this? I recall
the issue of ARP caches coming up from time to time.
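
A rough sketch of how such an attribute might be handled, only to make
the idea concrete. The parameter name arp_notify_hosts (seen by the
agent as OCF_RESKEY_arp_notify_hosts) and the ARPING override variable
are hypothetical names, not existing IPaddr parameters. The arping
flags are the ones from Christof's function quoted below.

```shell
# Sketch only: send an unsolicited ARP to every device listed in the
# hypothetical attribute arp_notify_hosts (space-separated IPs or names).
notify_arp_hosts() {
    arp_nic="$OCF_RESKEY_nic"
    arp_ip="$OCF_RESKEY_ip"
    arp_hosts="$OCF_RESKEY_arp_notify_hosts"   # hypothetical attribute
    arp_cmd="${ARPING:-/sbin/arping}"          # hypothetical override for dry runs

    # Nothing to do unless the interface, the address, and at least
    # one device to notify are all known.
    [ -n "$arp_nic" ] && [ -n "$arp_ip" ] && [ -n "$arp_hosts" ] || return 0

    for arp_host in $arp_hosts; do
        # A failed notification is reported but does not fail the start.
        $arp_cmd -f -q -c 5 -w 5 -I "$arp_nic" -s "$arp_ip" -U "$arp_host" ||
            echo "arping to $arp_host failed" >&2
    done
    return 0
}
```

Like notify_switches, this would be called from ip_start once the
interface name is known. Whether a failed arping should fail the
resource start is a design choice; the sketch assumes it should not.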

Cheers,

Dejan

> 
> notify_switches() {
>   # Quote the variable and use -n so an unset nic does not break the test
>   if [ -n "$OCF_RESKEY_nic" ]
>   then
>     INTERFACE="$OCF_RESKEY_nic"
>     IP="$OCF_RESKEY_ip"
>     # notify switches about IP<=>MAC change
>     # -f : quit on first reply
>     # -q : be quiet
>     # -c count : how many packets to send
>     # -w timeout : how long to wait for a reply
>     # -I device : which ethernet device to use (eth0)
>     # -s source : source ip address
>     # -U : Unsolicited ARP mode, update your neighbours
>     /sbin/arping -f -q -c 5 -w 5 -I "$INTERFACE" -s "$IP" -U <switch ip or name>
>   fi
> }
> 
> This function is called from ip_start (by then the interface name is
> known). It works very well for our cluster.
> 
> Kind regards,
> Christof
> 
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

-- 
Dejan