http://en.wikipedia.org/wiki/5-4-3_rule
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Kurt Buff
Sent: Friday, September 20, 2013 12:59 PM
To: [email protected]
Subject: [NTSysADM] Semi-OT: Network problem

All,

In the past couple of weeks, $work has had a problem with network interruptions - frequent gaps in network connectivity where all contact is lost with servers for brief periods of time (1-2 minutes, usually). I could see the gaps in the graphs on my (very new and incomplete - long story, don't ask) cacti installation. Unfortunately, I've been unable to get cacti to graph CPU utilization for the switches, because they're Procurves, and I couldn't find a working XML file or configuration for that.

It's always happened while I've been unavailable, until today. Just now, I was able to show conclusively that our core layer3 switch (Procurve 3400cl-48G), which was hit hardest, spikes its CPU to 99% during these episodes. Volume of traffic is normal - no huge spikes in that, just normal variation, AFAICT, from the cacti graphs. I haven't had time to see if other switches also spike their CPU, but given the gaps in the graphs, I suspect that's the case.

I suspect someone is doing something stupid to create a layer2 loop, as we have lots of little 5 and 8 port switches on desktops and in our engineering lab - and in spite of the fact that I've set our core switch as the root of the spanning tree.

I'm setting up a box to do a tcpdump in a ring buffer with smallish files so that I can do analysis on them more easily. I'm not a packet analysis guy, though I've done some looking on occasion.

Anyone have thoughts on what to look for when I start my analysis?

Kurt
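One possible starting point for the analysis asked about above: a layer 2 loop usually shows up as the same frame arriving many times over, and as an unusually high share of broadcast traffic. Below is a minimal Python sketch of that check. It assumes the capture files are in the classic pcap format (not pcapng), which is what tcpdump -w normally writes. The function names and the min_copies threshold are illustrative only and do not come from any library.

```python
import struct
from collections import Counter

BROADCAST_MAC = b"\xff" * 6


def read_frames(data: bytes):
    """Yield (timestamp_seconds, frame_bytes) from classic pcap file contents."""
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"   # file written little-endian
    elif magic == 0xD4C3B2A1:
        endian = ">"   # file written big-endian
    else:
        raise ValueError("not a classic (microsecond) pcap file")
    offset = 24        # skip the 24-byte global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        offset += 16
        frame = data[offset:offset + incl_len]
        if len(frame) < incl_len:
            break      # truncated final record
        offset += incl_len
        yield ts_sec + ts_usec / 1e6, frame


def repeated_frames(data: bytes, min_copies: int = 10):
    """Return (frame, count) pairs for frames seen at least min_copies times."""
    counts = Counter(frame for _, frame in read_frames(data))
    return [(f, n) for f, n in counts.most_common() if n >= min_copies]


def broadcast_share(data: bytes) -> float:
    """Return the fraction of frames addressed to ff:ff:ff:ff:ff:ff."""
    total = bcast = 0
    for _, frame in read_frames(data):
        total += 1
        if frame[:6] == BROADCAST_MAC:
            bcast += 1
    return bcast / total if total else 0.0
```

Each file from the ring buffer (tcpdump's -C and -W options limit file size and file count) could be read with open(path, "rb").read() and passed to these functions. Some repetition is normal, since periodic ARP requests and retransmissions also produce identical frames, so the threshold is a judgment call; what would point to a loop is the same broadcast frame repeated hundreds of times within a fraction of a second.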

