Hiya, all. I have 3 HNs running a total of some 30 VEs. Three specific VEs have a problem that seems ARP-related. Any ideas on it would be much appreciated.

Symptom:
Every now and then, sometimes every minute and sometimes not for a whole hour, the VE becomes inaccessible via IP for some 10-30 seconds. During this time, other hosts in the network also can't ping the host. The "arp" command shows (incomplete) for the host's entry.
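
To pin down when the outages start and how long they last, a small logger could be run from another host on the same segment. This is only a sketch: the address 192.0.2.10 is a placeholder, the function names are made up for illustration, and it assumes the net-tools "arp" command and Linux "ping" are available.

```shell
#!/bin/sh
# Log a timestamped line whenever the ARP entry for a VE is incomplete.
# TARGET is a placeholder address (assumption), not one of the real VEs.
TARGET="${TARGET:-192.0.2.10}"

# Succeeds if the given arp output text contains an "(incomplete)" entry.
arp_is_incomplete() {
    case "$1" in
        *"(incomplete)"*) return 0 ;;
        *) return 1 ;;
    esac
}

# One check: ping once to trigger ARP resolution, then inspect the ARP table.
check_once() {
    ping -c 1 -W 1 "$TARGET" >/dev/null 2>&1
    entry=$(arp -n "$TARGET" 2>/dev/null)
    if arp_is_incomplete "$entry"; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') $TARGET ARP incomplete"
    fi
}
```

Running "while true; do check_once; sleep 2; done >> arp-outage.log" would then give a rough timeline of the outages to compare against switch or HN logs.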

Temporary fix:
Running arpsend on the HN ("/usr/sbin/arpsend -U -i $ip -c1 bond0") fixes the issue until the client expires its ARP entry. I have a cron job that runs this every minute, but even that isn't enough.
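
Since cron cannot fire more often than once a minute, the workaround could be wrapped in a loop instead. The following is a rough sketch of that idea, not a fix for the underlying problem: the function names, the DRY_RUN switch, and the 192.0.2.x addresses are assumptions for illustration; only the arpsend invocation itself comes from the command above.

```shell
#!/bin/sh
# Workaround helper: send an unsolicited ARP update for each listed VE IP.
# IFACE and ARPSEND default to the values used in the post.
IFACE="${IFACE:-bond0}"
ARPSEND="${ARPSEND:-/usr/sbin/arpsend}"

# Print the command that would be run for one IP (used for dry runs).
arp_update_cmd() {
    printf '%s -U -i %s -c1 %s\n' "$ARPSEND" "$1" "$IFACE"
}

# Send one ARP update per IP given as an argument.
# With DRY_RUN set (hypothetical switch), only print the commands.
arp_update_all() {
    for ip in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            arp_update_cmd "$ip"
        else
            "$ARPSEND" -U -i "$ip" -c1 "$IFACE"
        fi
    done
}
```

On the HN, something like "while true; do arp_update_all 192.0.2.10 192.0.2.11; sleep 10; done" would refresh the entries every 10 seconds, with the real VE addresses substituted.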

Other IP and routing info:
* There are 5 IP blocks, /27 and /28 in size. IPs from all blocks are arbitrarily distributed around the machines.
* We have 2 GigE switches. Each HN has dual GigE NICs, and uses Linux bonding. The 2 NICs go to the 2 switches, for fault tolerance.
* At the border we have a router which we don't control. Traffic between IP blocks, even if destined for the local network, is double-transited.

The problem affects only these three VEs and none of the others, and it follows these same three even when they are moved between HNs. As such, I have ruled out bad cables or switch ports, overloaded hardware, system load, and differences in the HNs' OS and sysctl params. All VEs are created using the same script; the only differences in their VZ configs would be the auto-generated MACs and veths.
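
Because the auto-generated MACs are the one thing that differs between VEs, one check worth sketching is a scan for MAC addresses that appear more than once across the VE configs. This assumes the configs live in /etc/vz/conf/ and carry the MACs in NETIF lines; the function name is made up for illustration.

```shell
#!/bin/sh
# Read text on stdin, extract every MAC address, and print (in lower case)
# each MAC that occurs more than once.
find_duplicate_macs() {
    grep -oiE '([0-9a-f]{2}:){5}[0-9a-f]{2}' \
        | tr 'A-F' 'a-f' \
        | sort \
        | uniq -d
}
```

Feeding it the configs from all three HNs at once, for example "cat /etc/vz/conf/*.conf | find_duplicate_macs" on the collected files, would print any MAC shared between VEs; no output means no duplicates were found.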

I've been over these pages, and couldn't find information to help:
http://wiki.openvz.org/Multiple_network_interfaces_and_ARP_flux
http://wiki.openvz.org/Virtual_Ethernet_device

Any thoughts on troubleshooting this? Any further information I should provide?

--
Gregor Mosheh / Greg Allensworth, BS, A+
System Administrator
HostGIS cartographic development & hosting services
http://www.HostGIS.com/

"Remember that no one cares if you can back up,
 only if you can restore." - AMANDA