RE: [WIRELESS-LAN] Theories on a massive problem on our WLAN?

2006-03-13 Thread Scholz, Greg
Leaving aside the crushing of the regular Ethernet switches (3500s), I have
seen Cisco APs do something similar to what you are speculating.  This
was pre-IOS, and Cisco confirmed what we saw but never fixed the issue
directly in VxWorks because they claimed the IOS version would not
have it.  I left that job before we moved to IOS, so I do not know the
result.

So with all those caveats, here is what we were 99.999% sure was
happening.  Each Cisco AP maintains a list of associations. The list
of associations includes clients as well as ALL APs in the same
broadcast domain.  I believe it has something to do with handoffs and
such, or maybe just informational traffic; I am not sure.  In any case we
had 331 APs but only a small handful of clients.  The APs were getting
creamed by trying to keep track of 330 of their buddies as well as their
buddies' client associations.  That was the major flaw.  Cisco said if we
HAD to, we should split up the management VLAN so there were not 331 in
the same broadcast domain, but leave the client VLANs alone.  We did this
as a short-term fix.
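
For a rough sense of the scale, here is a small sketch of the arithmetic.
It assumes each AP keeps exactly one entry per peer AP in its own
broadcast domain and ignores client entries; the function name and the
even split across domains are illustrative only, not a description of
how the Cisco code actually behaves.

```python
def peer_entries(total_aps: int, domains: int = 1) -> int:
    """Total AP-to-AP entries held across the whole network.

    Assumption (not confirmed by Cisco): every AP tracks every other AP
    in its own broadcast domain, and APs are spread as evenly as
    possible over `domains` broadcast domains.
    """
    base, extra = divmod(total_aps, domains)
    # `extra` domains end up with one more AP than the others.
    sizes = [base + 1] * extra + [base] * (domains - extra)
    return sum(n * (n - 1) for n in sizes)


if __name__ == "__main__":
    for d in (1, 2, 4, 8):
        print(f"{d} broadcast domain(s): {peer_entries(331, d)} peer entries")
```

The count grows roughly with the square of the domain size, which is why
splitting only the management VLAN helped as much as it did.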

To compound this, someone (not me) told us at that time Wavelink was the
only way to go for management.  We went with it, only to find the following
problem.  One of the things it did was to periodically (5-15 mins or so)
poll each access point, including its association table.  Well, there
you go with 330 entries from each and every one of the 330 APs, in addition
to each AP's config itself.
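
As a back-of-the-envelope sketch of that polling load, assuming every
poll pulls the full association table and using the 330 x 330 figure
above (the function name is made up for illustration, and entry sizes
are unknown, so this counts entries rather than bytes):

```python
def polled_entries_per_hour(aps: int, entries_per_ap: int,
                            interval_min: float) -> int:
    """Association-table entries pulled by the manager in one hour.

    Assumption: each poll returns the AP's full table, and every AP is
    polled once per interval.
    """
    sweeps_per_hour = 60 / interval_min
    return round(aps * entries_per_ap * sweeps_per_hour)


if __name__ == "__main__":
    for minutes in (5, 15):
        total = polled_entries_per_hour(330, 330, minutes)
        print(f"polling every {minutes} min: {total} entries/hour")
```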

Needless to say both of these issues caused a bit of what I would say
was excessive management traffic.

I cannot remember the protocol name but if you do a sniff where you can
see layer 2 management traffic between the APs it should be pretty
obvious. I would look to see if WLSE is doing some sort of unexpected
query of the APs that may cause a larger than reasonable response.

Hope this gets you somewhere.

_
Thank you,
Gregory R. Scholz
Lead Network Engineer
Information Technology Group
Keene State College
(603)358-2070

--Seek first to understand, and then to be understood.
  (Stephen Covey)


-----Original Message-----
From: Lee Badman [mailto:[EMAIL PROTECTED] 
Sent: Monday, March 13, 2006 12:48 PM
To: WIRELESS-LAN@LISTSERV.EDUCAUSE.EDU
Subject: [WIRELESS-LAN] Theories on a massive problem on our WLAN?

Wondering if anyone in the group cares to hazard a theory.

Our Cisco WLAN has been quite stable for better than three years.
Currently running *180* 1130s, *120* 1200s, and a couple dozen 350s,
mostly IOS but a couple of legacy VxWorks units that are hard to get to
for conversion. We have the classic DMZ/captive portal thing going on, where a
home-built gateway head-ends each of our two major wireless spaces, with
an optional VPN box for each space. We do trunk specific VLANs around
for each space. WLSE manages it all, no WLSM, no forced client
encryption (other than voluntary VPN). IOS APs are current and all
within 2 minor revisions of each other, and have been cruising along
nicely for quite a while.

This past Saturday, very early in the morning, one of our wireless
spaces was creamed by some sort of broad-ranging, severe multicast
flood. Long story short, it seemed like the APs were chattering back and
forth to each other with huge, continuous multicast streams that
overwhelmed many of the switches carrying the traffic. Once it started,
it seemed to be self-propagating. We had to put in some ACLs to break
things up, and in some cases reboot the switches. Cat 3500s seem to take
the worst of it, and a couple got corrupted to the point of becoming
doorstops.

Knowing that it's hard to see the whole picture from afar, wondering if
anyone has ever experienced anything like this? 

Thanks for playing the game.

Lee

Lee Badman
Network Engineer
CWNA, CWSP
Information Technology and Services
(Formerly Computing and Media Services)
Syracuse University
(315) 443-3003
[EMAIL PROTECTED]

**
Participation and subscription information for this EDUCAUSE Constituent
Group discussion list can be found at http://www.educause.edu/groups/.



Re: [WIRELESS-LAN] Theories on a massive problem on our WLAN?

2006-03-13 Thread Dale W. Carder
Thus spake Lee Badman ([EMAIL PROTECTED]) on Mon, Mar 13, 2006 at 12:47:51PM 
-0500:
> Wondering if anyone in the group cares to hazard a theory.

Sure, as long as you don't hold me to it!

Sounds like IAPP is freaking out.  I've heard rumors of this.
I think, for example, you can get an IAPP storm by putting a
loopback interface on an IOS AP in heavyweight mode.

In general, Cisco APs aren't known for scaling to high numbers
of APs on the same subnet due to this sort of chatter between
them and the amount of state info they all think they need to
carry.  So, I would look hard at splitting up into a lot more
layer 3 subnets.

See if "sh iapp statistics" has any clues.

Did you get a sniffer trace?  I think Ethereal can decode IAPP.
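
If you do have a trace, a quick way to see which boxes are doing the
chattering is to tally multicast frames per source MAC. Below is a
minimal sketch, assuming the capture was saved in classic libpcap
format (not pcapng) with an Ethernet link type. It does not decode
IAPP itself; it only counts frames sent to a group address. The
function name is mine, not part of any tool.

```python
import struct
from collections import Counter
from typing import BinaryIO

LINKTYPE_ETHERNET = 1


def multicast_by_source(stream: BinaryIO) -> Counter:
    """Count group-addressed (multicast or broadcast) frames per source MAC."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    magic = header[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"  # file written little-endian
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"  # file written big-endian
    else:
        raise ValueError("not a classic pcap file")
    (linktype,) = struct.unpack(endian + "I", header[20:24])
    if linktype != LINKTYPE_ETHERNET:
        raise ValueError("expected an Ethernet capture")

    counts: Counter = Counter()
    while True:
        record = stream.read(16)
        if len(record) < 16:
            break  # end of file
        _sec, _frac, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        frame = stream.read(incl_len)
        if len(frame) < incl_len:
            break  # truncated final record
        if len(frame) < 12:
            continue  # too short to hold both MAC addresses
        dst, src = frame[:6], frame[6:12]
        if dst[0] & 0x01:  # low bit of first octet set = group address
            counts[":".join(f"{b:02x}" for b in src)] += 1
    return counts


# Example use (the file name is hypothetical):
#   with open("wlan-storm.pcap", "rb") as f:
#       for mac, n in multicast_by_source(f).most_common(10):
#           print(mac, n)
```

If the top talkers all turn out to be AP MACs, that would at least be
consistent with the AP-to-AP chatter theory.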

Make sure you filter IAPP with ACLs where needed, too.

Can you enable multicast storm control on the 3500 platform?

I wouldn't exactly expect this to get fixed either since Cisco
is basically throwing away everything and trying again with the
company they bought to replace Aironet.  IMHO, of course.

Dale


Dale W. Carder - Network Engineer   | DoIT Network Services
University of Wisconsin at Madison  | [EMAIL PROTECTED] 
(608) 263-3628 | 24hr NOC: 263-4188 | http://net.doit.wisc.edu/~dwcarder
