On 7/14/10 10:06 AM, Matthias Müller wrote:
> Hi,
> 
> On Wed, 14 Jul 2010 11:16:45 -0400 (EDT)
>> The vlan in question had an arp timeout of 60s and had a couple of KVM 
>> servers with 100 or so virtual machines.  Especially when a large number 
>> of VMs started up, we'd see periods of packet loss.  My assumption is that 
>> the sup720-3bxl can only handle so much arp activity on a vlan.  There was 
>> nothing in the config to artificially limit arp traffic.
> 
> We've seen this issue here, too. The problem at our site was synchronized 
> ARP requests from some load balancers (they drifted into sync over time). 
> Maybe that's the case here, too: your virtual machines send out ARP requests 
> for the default gateway at the same time, the SUP misses some of them and 
> runs at high CPU load, and the result is a short traffic interruption for 
> some hosts.
> Synchronization like that, funnily enough, happens often in network 
> environments after some time if no precautions are taken, and it can have 
> surprising effects.
> 
> Matthias


And here as well.  Any high-traffic, multi-switch network with HSRP-fed
subnets and high-bandwidth users is prone to this.


I try to keep the device count per subnet sane, use a 240 second ARP timeout
on the HSRP-speaking routers, and for high-traffic setups tend to favor
pairs of routing switches (L2 domain limited to the server cluster and 2
switches) acting as redundant routers.
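
For the ARP timeout part, a minimal sketch of what that looks like on an
IOS SVI. The VLAN number here is made up for illustration; only the
"arp timeout" interface command and the 240 second value come from the
setup described above.

```
! Hypothetical SVI for the server VLAN (Vlan100 is an assumed name)
interface Vlan100
 ! Age out ARP entries after 240 seconds (IOS default is 14400)
 arp timeout 240
```

The same line would go on the matching SVI of each router in the HSRP
pair, so both age their ARP entries on the same schedule.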


_______________________________________________
cisco-nsp mailing list  [email protected]
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
