Also, are you running any XML applications on the phones that may be bogging down the CPU?
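If it helps to separate "slow phone" from "network loss", here is a rough sketch for checking how long each keepalive goes unanswered, using the KeepAlive/KeepAliveAck exchange discussed in the quoted thread. It assumes you have already exported (timestamp, message name) pairs for one phone's SCCP session from a capture. The function name, the event format, and the 1-second threshold are all made up for illustration, not taken from any Cisco tool.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical event format: (capture timestamp in seconds, SCCP message name)
Event = Tuple[float, str]


def keepalive_gaps(
    events: Iterable[Event], threshold: float = 1.0
) -> List[Tuple[float, Optional[float]]]:
    """Pair each KeepAlive with the next KeepAliveAck.

    Returns (keepalive_timestamp, delay) for every ack slower than
    `threshold` seconds, and (keepalive_timestamp, None) for every
    keepalive that was never acknowledged.
    """
    slow: List[Tuple[float, Optional[float]]] = []
    pending: Optional[float] = None  # timestamp of the unanswered KeepAlive
    for ts, name in sorted(events):
        if name == "KeepAlive":
            if pending is not None:
                # A new KeepAlive arrived before the previous one was acked.
                slow.append((pending, None))
            pending = ts
        elif name == "KeepAliveAck" and pending is not None:
            delay = ts - pending
            if delay > threshold:
                slow.append((pending, delay))
            pending = None
    if pending is not None:
        # Capture ended with a KeepAlive still waiting for its ack.
        slow.append((pending, None))
    return slow


if __name__ == "__main__":
    sample = [
        (0.0, "KeepAlive"),
        (0.02, "KeepAliveAck"),
        (30.0, "KeepAlive"),
        (33.5, "KeepAliveAck"),
        (60.0, "KeepAlive"),
    ]
    for sent_at, delay in keepalive_gaps(sample):
        print(sent_at, "no ack" if delay is None else f"{delay:.1f}s")
```

Running the same check on the capture taken at the UCM and the one taken at the phone's switch port should show which side of the path the delay first appears on.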
On Tue, Nov 15, 2016 at 11:31 AM, Wes Sisk (wsisk) <[email protected]> wrote:
> Adam,
>
> Are you using dot1x? There are some interesting things in that space.
>
> Otherwise, maybe get 9.4.2es3 to pick up the fix for
> CSCuq88325 7965 7945 excessive core files cause phone stability problems
>
> -w
>
> On Nov 15, 2016, at 8:50 AM, Pawlowski, Adam <[email protected]> wrote:
>
> All,
>
> We’re still looking at this with TAC, though the initial response was that the 7941, 7961, etc. are done with hardware and software support. There was an announcement on October 20th that said software maintenance ended immediately (oops). Our timers and such are the same across our network, all defaults, and we don’t have this problem elsewhere. I went looking for MAC change traps and didn’t run into anything going down that road. The phones don’t log any VLAN changes in their logs either. The phones are going out of service for UCM Closed TCP or UCM Reset TCP, and we see what looks like the UCM not responding with the proper SCCP KeepAliveAck, which causes the phone to sort of do nothing for 60 seconds. By then, since both the phone and the UCM have been waiting 60.0 seconds to hear from each other, the connection is reset and closed.
>
> Phones that are not sharing the data VLAN have been fine, but we cannot implement that across this entire area due to the needed cabling, switchports, etc.
>
> In another location we have these phones going into what appears to be high CPU: the latency on the phone goes way up, both in ICMP response and in the response of the phone to buttons and actions, and calls suffer from high jitter and broken conversations. Oddly enough, when we capture with SPAN enabled on the phone, the data looks fine going through it. Power cycling the phone clears this temporarily.
>
> Everyone thus far has wanted to go down the road of loss somewhere on the network, but as we continue to take captures, we see the conversation complete at the UCM, and beyond the phone via “SPAN to PC port”, or at it with SPAN at the edge. The phone application itself is simply not responding in a timely manner, at least based on initial observation.
>
> Given the earlier response that these devices are now done with support, this does not bode well, but we are still looking.
>
> Regards,
>
> Adam Pawlowski
> SUNYAB NCS
>
> *From:* Wes Sisk (wsisk) [mailto:[email protected] <[email protected]>]
> *Sent:* Tuesday, November 08, 2016 12:50 PM
> *To:* Pawlowski, Adam
> *Cc:* Tommy Schlotterer; [email protected]
> *Subject:* Re: [cisco-voip] Traffic Issues with 7900 Series Phones
>
> Not much visibility into L1/L2 on those phones; drop counters on the webpage or phone UI are about all you get.
>
> Are the phones randomly unregistering? This is a good baseline: https://supportforums.cisco.com/document/52176/understanding-sccp-phone-unregistration-and-failover-networks-perspective
>
> If it is some sort of frame issue, correct, not many options.
>
> What is the nature of the messages being retransmitted?
> Also, anything interesting looking in the log files?
>
> One age-old odd one is CDP timers out of sync between phone and switch. Phone keeps IP but gets dumped into data vlan. Your choice on how to approach that.
>
> One possibility: if phones are unregistering, then check the lastoutofservice reason on the phone, in the CM traces, or in the RTMT reports if you’re on a new enough version. I *think* we got these phones fixed to say ‘vlan change’ or ‘cdp timeout’ or ‘ip change’, something like that, if there were changes in the network interface.
>
> Alternatively, take a few phones and stick them in a port that is not trunked but in the voice vlan… do these exhibit the same problem?
>
> Next ‘heuristic’ guess after that is possibly arp cache refresh on the switch. Have seen several issues where arp cache timeout was set low, switch re-arps for many devices concurrently, arp response dropped by input queue overflow and input queue drop. Net result: the switch ‘forgets’ which port that phone is on.
>
> So…. where do the packets/frames EXIST and NOT EXIST in the network?
>
> -Wes
>
> On Nov 4, 2016, at 4:32 PM, Pawlowski, Adam <[email protected]> wrote:
>
> Wes,
>
> Thanks, that's good to know about ICMP. We've seen phones that get into a state where they reply with response times all over the board, lossy, which Reset/Restart from the UCM does not rectify. Powering the device down does clear the condition - the set is otherwise idle. I need to get into one of those via SSH and pull the CPU to see if it is up at that time, to see if there's an identifiable process that covers this.
>
> We did get some captures from in front of the firewall where the UCM resides, and from a monitor session from the switch out at the edge where the phone is connected. We can see the UCM sending re-transmissions to the phone, and the phone eventually replying some time later. Unless there is a reason for us to try and get a copper tap on the segment between the switch and the phone, it would seem that there is some reason the phone is not replying to the UCM. There is nothing behind the phone, nor any output buffer drops. Our delay here in reply is some number of seconds, so I don't believe there's any buffering involved that would be to that extent.
>
> What I fear is that if we get to a point where we can determine there is some frame that is an issue, these devices are past the point of any patching being done.... as of a few weeks ago. But, since replacing phones is not free and takes a bunch of time, I still have to come up with something.
> I only saw a bug for large sized ICMPv6 with nothing particularly helpful in the wording and the workaround of "don't do that", so I'm not hopeful.
>
> We have our AM and SE aware of what is going on, and they've offered to help, so I'm hopeful we can eventually confirm the reason we're having trouble, even if we can't directly fix it.
>
> Adam
>
> -----Original Message-----
> From: Wes Sisk (wsisk) [mailto:[email protected] <[email protected]>]
> Sent: Friday, November 04, 2016 12:52 PM
> To: Pawlowski, Adam
> Cc: Tommy Schlotterer; [email protected]
> Subject: Re: [cisco-voip] Traffic Issues with 7900 Series Phones
>
> Phones process ICMP traffic with low priority and throttling. This was implemented to stem DoS attempts. Consider looking more at Voice Quality effects, retransmits in packet captures, or parsing CCM traces for round trip times. As you state, these phones are relatively late in life and therefore relatively stable.
>
> -Wes
>
> On Nov 2, 2016, at 2:42 PM, Pawlowski, Adam <[email protected]> wrote:
>
> Tommy,
>
> Sorry about that. These are a mixed bag: 41/61, both G and G-GE phones, with the gigabit ones primarily. Some SCCP, some SIP, mostly 9.4.2SR1-1, but seen on 9.4.2SR2-2. PC attached or not, no difference; the only difference we've been able to create that stops this is changing the data VLAN that runs through the phone to a different one, or something null (with no PC).
>
> Adam
>
> -----Original Message-----
> From: Tommy Schlotterer [mailto:[email protected] <[email protected]>]
> Sent: Wednesday, November 02, 2016 2:37 PM
> To: Pawlowski, Adam; [email protected]
> Subject: RE: Traffic Issues with 7900 Series Phones
>
> What specific models of phones, e.g. 41s/61s or 40s/60s?
>
> Thanks
>
> Tommy
>
> Tommy Schlotterer | Systems Engineer
> Presidio | www.presidio.com
> 20 N. Saint Clair, 3rd Floor, Toledo, OH 43604
> D: 419.214.1415 | C: 419.706.0259 | [email protected]
>
> -----Original Message-----
> From: cisco-voip [mailto:[email protected] <[email protected]>] On Behalf Of Pawlowski, Adam
> Sent: Wednesday, November 02, 2016 2:23 PM
> To: [email protected]
> Subject: [cisco-voip] Traffic Issues with 7900 Series Phones
>
> After much hair pulling and frustration, I wanted to ask the group here in case anyone has seen this or has any thoughts on what we should be looking for.
>
> We have a number of 7900 series phones that have been exhibiting issues that appear to me to be that the phone is getting hung up on something.
>
> Some sort of frame or packet is screwing with the network chip/board or the OS, which is causing it trouble. I see missed traffic, missed responses, high ICMP echo times - and phones that eventually get stuck with their ICMP echo response times being all over the board - with some reports of call trouble and CMR showing crazy jitter. If I power cycle the phone, that clears and it works fine for a while.
>
> I realize these items are pretty much at end of useful life, pretty much all done with software support, and are going to drop off of the compatibility matrix and probably functional support in the near future. But, while we still have a ton of them - has anyone noted any particular type of traffic that causes the 7900 series phones grief?
>
> I don't have loss on the network, and there do not seem to be any transient broadcast storms rolling by. We do see an increased amount of mDNS, IPv6 (phones are v4 only) etc, but nothing stands out as causing a particular problem. It just seems that whatever this is, is causing a memory leak or something, wherein it gets bad enough that things go to hell eventually.
>
> Any thoughts?
>
> Adam P
> SUNYAB
> _______________________________________________
> cisco-voip mailing list
> [email protected]
> https://puck.nether.net/mailman/listinfo/cisco-voip
