On 7/14/10 10:06 AM, Matthias Müller wrote:
> Hi,
>
> On Wed, 14 Jul 2010 11:16:45 -0400 (EDT)
>> The vlan in question had an arp timeout of 60s and had a couple of KVM
>> servers with 100 or so virtual machines. Especially when a large number
>> of VMs started up, we'd see periods of packet loss. My assumption is that
>> the sup720-3bxl can only handle so much arp activity on a vlan. There was
>> nothing in the config to artificially limit arp traffic.
>
> We've seen this issue here, too. The problem at our site was the synchronized
> arp request from some loadbalancers (synchronizing over time). Maybe that's
> the case here, too, that your virtual machines send out arp-requests for the
> default gateway during the same time and then the SUP misses some of these
> arp requests, has high cpu load and results in short traffic interruption for
> some hosts.
> Synchronizing of stuff like that funnily happens often in network environments
> after some time if no precautions are taken and it can result in surprising
> effects.
>
> Matthias

And here as well. Any high-traffic, multi-switch network with HSRP-fed
subnets and high-bandwidth users is prone to this. I try to keep the device
count per subnet sane, use a 240-second ARP timeout on the HSRP-speaking
routers, and for high-traffic setups tend to favor pairs of routing switches
(L2 domain limited to the server cluster and 2 switches) providing redundant
routing.
_______________________________________________
cisco-nsp mailing list  [email protected]
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
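
P.S. To put rough numbers on the synchronization effect discussed in this
thread, here is a toy Python sketch. It is only an illustration under stated
assumptions, not a model of how a sup720 actually processes ARP: it assumes
every host re-ARPs for the gateway on a fixed interval, and it compares hosts
that all fire at the same instant with hosts whose timers are spread by a
random phase offset. The function name, the one-second buckets, and the
100-host / 60-second figures (borrowed from the numbers quoted above) are
illustrative choices.

```python
import random
from collections import Counter


def peak_arp_per_second(num_hosts, refresh_s, duration_s, jitter, seed=0):
    """Return the largest number of ARP requests landing in any one-second bucket.

    Toy model (assumption): each host sends one ARP request every
    refresh_s seconds for duration_s seconds.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    buckets = Counter()
    for _ in range(num_hosts):
        # Synchronized hosts all start at t=0; jittered hosts start at a
        # random phase somewhere inside the first refresh interval.
        t = rng.uniform(0, refresh_s) if jitter else 0.0
        while t < duration_s:
            buckets[int(t)] += 1
            t += refresh_s
    return max(buckets.values())


# 100 hosts, 60-second refresh, observed over 10 minutes.
sync_peak = peak_arp_per_second(100, 60, 600, jitter=False)
spread_peak = peak_arp_per_second(100, 60, 600, jitter=True)
print("synchronized peak per second:", sync_peak)
print("jittered peak per second:", spread_peak)
```

In the synchronized case every host lands in the same one-second bucket, so
the peak equals the host count. With random phases the same average load is
spread across the interval and the peak drops to a small fraction of that.
A longer timeout lowers the average rate, but it does not by itself remove a
synchronized burst.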
