Re: [WIRELESS-LAN] spurious cpi report of mass AP disassociation

2017-09-11 Thread Mark Duling
Thanks for all the replies, everyone. I'm not used to looking at AP
logs, but I just logged into one AP on the list, and for the day it happened
all I see are some of these:

%DOT11-4-CCMP_REPLAY  AES-CCMP TSC replays

and two of these for a client:

%DOT11-4-FLUSH_DEAUTH: Consecutive tx fail 500+: deauth

I would think that if the AP thought it had disassociated, it would say so.
Another AP on the list shows nothing corresponding to the time (if I've
translated the time properly), but its radio interface was reset during the
day.

*Sep  8 00:45:20.799: %LINK-6-UPDOWN: Interface Dot11Radio1, changed state to down
*Sep  8 00:45:20.803: %LINK-5-CHANGED: Interface Dot11Radio1, changed state to reset
*Sep  8 00:45:21.807: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio1, changed state to down
*Sep  8 00:45:21.831: %DOT11-6-DFS_SCAN_START: DFS: Scanning frequency 5500 MHz for 60 seconds.
*Sep  8 00:45:21.835: %LINK-6-UPDOWN: Interface Dot11Radio1, changed state to up
*Sep  8 00:45:22.835: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio1, changed state to up
*Sep  8 00:45:35.347: %CLEANAIR-6-STATE: Slot 1 down
*Sep  8 00:45:52.167: %CLEANAIR-6-STATE: Slot 1 enabled
*Sep  8 00:46:21.947: %DOT11-6-DFS_SCAN_COMPLETE: DFS scan complete on frequency 5500 MHz
*Sep  8 01:28:39.379: %DOT11-4-CCMP_REPLAY: Client [redacted] had 1 AES-CCMP TSC replays
*Sep  8 02:03:10.883: %DOT11-4-CCMP_REPLAY: Client [redacted] had 1 AES-CCMP TSC replays
*Sep  8 21:44:55.403: %DOT11-4-CCMP_REPLAY: Client [redacted] had 46 AES-CCMP TSC replays

Not sure what to make of the logs.
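
For anyone who would rather tally entries like these than eyeball them, here is a minimal Python sketch. It assumes the log has been pasted into a string in the same "*Mon DD HH:MM:SS.mmm: %FACILITY-SEVERITY-MNEMONIC: text" shape shown above, and it tolerates entries that a mail client has wrapped onto a second line. The names parse_ap_log and summarize are made up for illustration; they are not part of any Cisco tool.

```python
import re
from collections import Counter

# Start of one AP log entry, e.g.
# "*Sep  8 00:45:20.799: %LINK-6-UPDOWN: Interface Dot11Radio1, ..."
ENTRY_RE = re.compile(
    r"^\*(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\.\d{3}):\s+"
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+):\s*"
    r"(?P<msg>.*)$"
)


def parse_ap_log(text):
    """Return one dict per log entry, folding wrapped lines into the entry."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_RE.match(line)
        if match:
            entries.append(match.groupdict())
        elif entries:
            # Not a new entry: treat it as the wrapped tail of the previous one.
            entries[-1]["msg"] += " " + line
    return entries


def summarize(entries):
    """Count entries per FACILITY-MNEMONIC pair."""
    return Counter(f"{e['facility']}-{e['mnemonic']}" for e in entries)


# A few lines copied from the log above, still wrapped as in the email.
SAMPLE = """\
*Sep  8 00:45:20.799: %LINK-6-UPDOWN: Interface Dot11Radio1, changed state
to down
*Sep  8 00:45:21.831: %DOT11-6-DFS_SCAN_START: DFS: Scanning frequency 5500
MHz for 60 seconds.
*Sep  8 21:44:55.403: %DOT11-4-CCMP_REPLAY: Client [redacted] had 46
AES-CCMP TSC replays
"""

if __name__ == "__main__":
    for key, count in summarize(parse_ap_log(SAMPLE)).items():
        print(key, count)
```

The timestamps are kept as strings because the log lines carry no year, so any conversion to a real date would need an assumption about when the log was captured.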

**
Participation and subscription information for this EDUCAUSE Constituent Group 
discussion list can be found at http://www.educause.edu/discuss.



Re: [WIRELESS-LAN] spurious cpi report of mass AP disassociation

2017-09-11 Thread Jeffrey D. Sessler
Did you go back and correlate the event? For example, SSH into a few of the 
WAPs and look at their logs to see what they thought happened. Did the CAPWAP 
uptime, and/or the hours they report being connected, actually change on those 
WAPs? The WAP logs tend to be very informative.

If you use DHCP to hand out IPs for the WAPs, did you have a look at your DHCP 
logs? Many years ago, I saw something similar and it turned out to be the DHCP 
server – a mass of WAPs went to renew at the same time, DHCP server couldn’t 
take the load, and failing the renewal, a mass of WAPs disassociated/associated.
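
To check whether a renewal storm could line up with the reported timestamp, it helps to work out when a batch of WAPs would hit their DHCP timers. By default a DHCP client tries to renew at half the lease time (T1) and falls back to rebinding at 87.5% of it (T2), per RFC 2131. The sketch below is only arithmetic under assumptions: the 24-hour lease and the shared lease start time are hypothetical values, and dhcp_timers is an illustrative helper, not something from the DHCP server or the controller.

```python
from datetime import datetime, timedelta


def dhcp_timers(lease_start, lease_seconds):
    """Return default renew (T1), rebind (T2) and expiry times for one lease.

    Uses the RFC 2131 defaults: T1 = 0.5 * lease, T2 = 0.875 * lease.
    A server can override both, so treat the result as an estimate.
    """
    lease = timedelta(seconds=lease_seconds)
    return {
        "renew_t1": lease_start + lease * 0.5,
        "rebind_t2": lease_start + lease * 0.875,
        "expiry": lease_start + lease,
    }


if __name__ == "__main__":
    # Hypothetical: a group of WAPs all obtained a 24-hour lease at the same
    # moment, for example after a switch or power event.
    start = datetime(2017, 9, 7, 0, 0, 0)
    for name, when in dhcp_timers(start, 24 * 3600).items():
        print(f"{name}: {when}")
```

If the WAPs that dropped all share a lease start time in the DHCP logs, the computed times give a window to compare against the CPI event.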

Jeff



RE: [WIRELESS-LAN] spurious cpi report of mass AP disassociation

2017-09-11 Thread Lee H Badman
OK- you have far fewer APs than us- but we are on your code. I don’t want to 
muddy the waters and prejudice your conclusions, as you really need TAC to work 
you through this minefield. Know that there are bugs and circumstances 
related to some shared resource between 802.1X and CAPWAP processes on the 
8540, though the things I have read tend to be for higher AP counts. Whether any 
of this might apply to you, only the right debug will show.

Good luck-

Lee


Lee Badman | Network Architect

Certified Wireless Network Expert (#200)
Information Technology Services
206 Machinery Hall
120 Smith Drive
Syracuse, New York 13244
t 315.443.3003   f 315.443.4325   e lhbad...@syr.edu w 
its.syr.edu
SYRACUSE UNIVERSITY
syr.edu



Re: [WIRELESS-LAN] spurious cpi report of mass AP disassociation

2017-09-11 Thread Mark Duling
>> Out of curiosity- how many APs, clients (in general) and are you doing
>> 802.1X?

It's roughly 250 APs out of approximately 1k, so about 1/4 of the total
APs. Yes, we are doing 802.1X.

We don't have a large set of mobility groups, so it doesn't sound
like CSCva66176.



RE: [WIRELESS-LAN] spurious cpi report of mass AP disassociation

2017-09-11 Thread Manon Lessard
Hi

I saw such events when we faced bug CSCva66176, but it was massive and 
recurrent.
When we moved from WISM2 to 8540s last May, thinking I should be careful, I 
kept my old controllers (which were turned off) in the list of devices in the 
mobility group so that we’d have a quick rollback. According to the initial 
description of the bug we were not at risk, but when the students (and yours 
truly) came back last week, surprise!

Resolution was to delete the old controllers and the rest is history.

Might not be exactly what you are experiencing though...

Manon Lessard
Systems Development Technician
CCNP, CWNA, CWDP
Direction des technologies de l'information
Pavillon Louis-Jacques-Casault
1055, avenue du Séminaire
Bureau 0403
Université Laval, Québec (Québec)
G1V 0A6, Canada

418 656-2131, ext. 12853
Fax: 418 656-7305
manon.less...@dti.ulaval.ca
www.dti.ulaval.ca



RE: [WIRELESS-LAN] spurious cpi report of mass AP disassociation

2017-09-11 Thread Lee H Badman
Out of curiosity- how many APs, clients (in general) and are you doing 802.1X?

Lee Badman | Network Architect

Certified Wireless Network Expert (#200)
Information Technology Services
206 Machinery Hall
120 Smith Drive
Syracuse, New York 13244
t 315.443.3003   f 315.443.4325   e lhbad...@syr.edu w 
its.syr.edu
SYRACUSE UNIVERSITY
syr.edu



spurious cpi report of mass AP disassociation

2017-09-11 Thread Mark Duling
We're using Cisco 8540 on code 8.2.151.0. Last week CPI reported a great
number of simultaneous AP disassociations and then reassociations. CPI shows
all the events had the exact same timestamp, right down to the hundredth of
a second. It was just a single event.

But I can find no event preceding it that would cause such a thing. No
preceding controller errors that I can see. At least a hundred APs were on
the list. The APs weren't the same type or in the same buildings. I can
find nothing in common among them at all.

No one called in to report any issues. I would think that if they really did
drop, those on an affected AP would have noticed. Only one AP in the
building housing IT was on the report, so perhaps it's not surprising that
none of us noticed anything.

Has anyone out there seen anything like this? Aside from the unknown cause,
is it possible that the disassociation and reassociation happened fast enough
that users wouldn't see any serious disruption if only doing stateless
stuff? I'd have trouble believing the controller would report AP drops that
didn't happen.



Cisco ISE issues version 2.1.0.474 (Patch 3)

2017-09-11 Thread Matlick, Michael
Good afternoon,

We are curious if there are any other campuses out there which are running 
Cisco ISE version 2.1.0.474 (patch 3).  We are running into issues where users 
on our 802.1x enabled SSID are able to authenticate sometimes, but not always 
(intermittent connectivity).  Re-authentications for devices that are able to 
connect may also fail, leading to device disconnects.  We see messages on the 
controllers stating that the ISE PSNs are not able to respond to authentication 
requests on time, and even after disabling aggressive failover, the controllers 
(WiSM 2 and WLC 5520) are still failing over between our two ISE PSN nodes for 
authentications.  TAC seems to be taking a while to figure out what our issues 
might be.

We found a bug report during our own troubleshooting at 
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvc69935/?referring_site=bugquickviewredir
 "ISE 2.1 dropping radius traffic for stuck packet in duplicate manager", and 
we would like to know whether anyone else has run, or is running, this version 
and has hit similar issues.

In case some are not able to view the bug notes, the symptoms listed are:

Symptom:
frequent radius drops very early (within the first 10 steps) in the detail 
logs of a failed authentication

endpoint frequently abandoning eap session or stopped responding - could be 
during peap tunnel establishment

high authentication latency with little to no load problems on PSN.

ISE failure reasons: packet already in process,

ISE failed messages for unexpected eap fragment or invalid radius attribute

prrt-server logs show duplicateManager setting nasip + source port + pktid in 
dup list with "added=true". The auth process finishes for a known duplicate 
session, but the duplicateManager never sets the same combo of nasip + source 
port + pktid to removed=true. The next incoming packet with the same NASIP, 
source port, and pktid combo will be rejected as "duplicate" even though it 
might be for a different session, or endpoint, or even auth method, i.e. MAB 
vs Dot1x.

Known Affected Releases: 2.1(0.474)

Known Fixed Releases: 2.1(0.904)
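
The last symptom in the bug notes is easier to picture with a toy model. The Python sketch below is not ISE code and makes no claim about how the real duplicate manager is written; the class name, its flag, and the sample values are all hypothetical. It only illustrates the described behaviour: a table keyed on NAS IP + source port + packet ID whose entries are never released, so a later, unrelated request that reuses the same key is treated as a duplicate. Because the RADIUS Identifier field is a single octet, a busy NAS reuses packet IDs quickly, which is why a stuck entry would matter.

```python
class DuplicateManager:
    """Toy duplicate-detection table keyed on (nas_ip, src_port, pkt_id).

    release_on_finish=False mimics the symptom described in the bug notes,
    where an entry is added but never marked as removed.
    """

    def __init__(self, release_on_finish=True):
        self.in_flight = set()
        self.release_on_finish = release_on_finish

    def accept(self, nas_ip, src_port, pkt_id):
        """Return True if the packet is processed, False if seen as duplicate."""
        key = (nas_ip, src_port, pkt_id)
        if key in self.in_flight:
            return False
        self.in_flight.add(key)
        return True

    def finish(self, nas_ip, src_port, pkt_id):
        """Called when authentication for the packet has completed."""
        if self.release_on_finish:
            self.in_flight.discard((nas_ip, src_port, pkt_id))


def two_sessions_same_key(manager):
    """Send two unrelated requests that happen to reuse the same key."""
    key = ("10.0.0.5", 32768, 17)  # hypothetical NAS IP, source port, packet ID
    first = manager.accept(*key)   # first endpoint authenticates
    manager.finish(*key)
    second = manager.accept(*key)  # a different endpoint reuses the key later
    return first, second


if __name__ == "__main__":
    print("healthy:", two_sessions_same_key(DuplicateManager(True)))
    print("stuck:  ", two_sessions_same_key(DuplicateManager(False)))
```

In the "stuck" case the second request is refused even though the first one has long since finished, which would match a failure reason like "packet already in process" on an otherwise lightly loaded PSN.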

We have seen the "ISE failure reasons: packet already in process" and "high 
authentication latency with little to no load problems on PSN" at least.  As 
far as I'm aware we still haven't confirmed this is exactly the problem we're 
facing as we're still working with TAC, but the symptoms seem similar.

We think it might be best to just go ahead and upgrade to the next patch 
(2.1(0.904)), even though our current patch was recommended to us by Cisco and 
installed early last week, but we'd like to know if anyone else has experienced 
similar issues and what the solution was. ISE upgrades take a long time per 
node (and we have 4 total), which is why it is not such an easy decision.

Maybe this can also serve as a sort of warning in case others are planning an 
ISE upgrade soon and aren't quite at version 2.2+ yet.


Thank you,

Michael Matlick
Network Control Specialist
University of Maryland, Baltimore CITS



eduroam AUP, revisit

2017-09-11 Thread Lee H Badman
Sorry to rehash a topic like this, but throwing the net out there again after 
only getting one reply (Thanks, Marcello). How are you who participate in 
eduroam as IDPs (Identity Providers)  making "reasonable effort" to inform your 
users about their responsibilities when visiting other campuses and using 
eduroam?

Thanks-

Lee


Lee Badman | Network Architect

Certified Wireless Network Expert (#200)
Information Technology Services
206 Machinery Hall
120 Smith Drive
Syracuse, New York 13244

t 315.443.3003   f 315.443.4325   e lhbad...@syr.edu w 
its.syr.edu

SYRACUSE UNIVERSITY
syr.edu



