Re: [c-nsp] me3600 ospf %100 cpu blowup
On 15/Jan/18 16:25, Aaron Gould wrote:
> ospf neighbors won't come up either with different mtu's

That is what I remember as well, yes... 2007 was the last time I ran anything OSPF.

Mark.
___
cisco-nsp mailing list cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
Re: [c-nsp] me3600 ospf %100 cpu blowup
ospf neighbors won't come up either with different mtu's

-----Original Message-----
From: cisco-nsp [mailto:cisco-nsp-boun...@puck.nether.net] On Behalf Of Mark Tinka
Sent: Monday, January 15, 2018 8:00 AM
To: Aaron
Cc: cisco-nsp@puck.nether.net
Subject: Re: [c-nsp] me3600 ospf %100 cpu blowup

On 14/Jan/18 17:36, Aaron wrote:
> Size of the ospf table

Been a long while since I ran OSPF in production - but I know IS-IS tests the MTU as adjacencies are built, and won't work unless PDU's are sent unfragmented across the wire.

Mark.
Re: [c-nsp] me3600 ospf %100 cpu blowup
I had something similar happen to me a couple months ago, and posted it here...

[c-nsp] ospf database size - affects that underlying transport mtu might have
https://www.mail-archive.com/cisco-nsp@puck.nether.net/msg65794.html

- Aaron
Re: [c-nsp] me3600 ospf %100 cpu blowup
On 14/Jan/18 17:36, Aaron wrote:
> Size of the ospf table

Been a long while since I ran OSPF in production - but I know IS-IS tests the MTU as adjacencies are built, and won't work unless PDU's are sent unfragmented across the wire.

Mark.
Re: [c-nsp] me3600 ospf %100 cpu blowup
Size of the ospf table

On Sunday, January 14, 2018, Mark Tinka wrote:
> On 13/Jan/18 18:33, adamv0...@netconsultings.com wrote:
>
> > Hmm could it be that you hit the mtu limit of your links (which is not 9216
> > but just 9000)?
>
> That would make sense - but if it's been working all this time, what
> changed?
>
> Is your transport network dark or leased?
>
> Mark.
Re: [c-nsp] me3600 ospf %100 cpu blowup
On 13/Jan/18 18:33, adamv0...@netconsultings.com wrote:
> Hmm could it be that you hit the mtu limit of your links (which is not 9216
> but just 9000)?

That would make sense - but if it's been working all this time, what changed?

Is your transport network dark or leased?

Mark.
Re: [c-nsp] me3600 ospf %100 cpu blowup
> Mike
> Sent: Sunday, January 14, 2018 2:32 AM
>
> >> mtu 9000' and the problem has not come back since
> >> (>10 hours now).
> >
> > Hmm could it be that you hit the mtu limit of your links (which is not
> > 9216 but just 9000)?
> >
> > adam
>
> That's what I'm thinking; the hardware mtu was set to 9216 but for whatever
> reason the mtu actually was not that and so ospf flooding couldn't complete
> since full mtu sized packets were lost/damaged.
> Setting the ip mtu down to 9000 was the answer. Ugh.

Try pinging with a packet size higher than 9000 and fragmentation disabled to see if that is indeed the case.
Or you can check with the provider of these links to see what the mtu on them is.

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::
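The ping test suggested above can be turned into a search for the largest size the path actually carries. This is only a sketch: probe(size) is a hypothetical callable that should return True when a ping of that size with fragmentation disabled gets a reply, and fake_probe, the function name, and the 1500-9216 range are illustrative and not from the thread.

```python
from typing import Callable


def largest_passing_size(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Binary-search the largest size in [low, high] for which probe(size) is True.

    Assumes every size up to the real limit passes and every size above it fails.
    Returns low - 1 if nothing in the range passes.
    """
    best = low - 1
    while low <= high:
        mid = (low + high) // 2
        if probe(mid):
            best = mid       # mid got through, try something bigger
            low = mid + 1
        else:
            high = mid - 1   # mid was lost, try something smaller
    return best


def fake_probe(size: int) -> bool:
    # Stand-in for a real don't-fragment ping; pretends the path tops out at 9000.
    return size <= 9000


print(largest_passing_size(fake_probe, 1500, 9216))
```

In practice probe would wrap whatever ping the platform offers with the DF bit set; a result below the configured ip mtu would point at the transport rather than at ospf.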
Re: [c-nsp] me3600 ospf %100 cpu blowup
>> mtu 9000' and the problem has not come back since
>> (>10 hours now).
>
> Hmm could it be that you hit the mtu limit of your links (which is not 9216
> but just 9000)?
>
> adam

That's what I'm thinking; the hardware mtu was set to 9216 but for whatever reason the mtu actually was not that and so ospf flooding couldn't complete since full mtu sized packets were lost/damaged. Setting the ip mtu down to 9000 was the answer. Ugh.
Re: [c-nsp] me3600 ospf %100 cpu blowup
> Mike
> Sent: Saturday, January 13, 2018 3:36 AM
>
> My initial analysis was flawed and reload did not in fact address the problem.
> I was able to observe the neighbors all having issues again and this time I saw
> that a neighbor relationship was stuck in 'loading'
> state. After a lot of looking around, I determined that the interface and ip
> mtu both were 9216. I suspect that this mtu was too big and that my igp had
> grown to a point where fragmentation was needed but due to an incorrect
> mtu here, the updates were not working due to full size packets. I added 'ip
> mtu 9000' and the problem has not come back since
> (>10 hours now).

Hmm could it be that you hit the mtu limit of your links (which is not 9216 but just 9000)?

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::
Re: [c-nsp] me3600 ospf %100 cpu blowup
On 13/Jan/18 05:36, Mike wrote:
> So ... the plot thickens.
>
> My initial analysis was flawed and reload did not in fact address the
> problem. I was able to observe the neighbors all having issues again and
> this time I saw that a neighbor relationship was stuck in 'loading'
> state. After a lot of looking around, I determined that the interface
> and ip mtu both were 9216. I suspect that this mtu was too big and that
> my igp had grown to a point where fragmentation was needed but due to an
> incorrect mtu here, the updates were not working due to full size
> packets. I added 'ip mtu 9000' and the problem has not come back since
> (>10 hours now).

Hmmh - not sure how this makes sense. But if it's working...

Mark.
Re: [c-nsp] me3600 ospf %100 cpu blowup
On 01/12/2018 12:26 PM, Aaron Gould wrote:
> I'll take a stab at it...
>
> Show log... (prior to reboot, so you may need to look at syslog...)
>
> If you see NILE ASIC errors of some sort, I recall TAC telling me there isn't
> a fix and reboot is required. :|
>
> I recall the nile asic thing being l2vpn related so I dunno about the
> ospf thing
>
> -Aaron

So ... the plot thickens.

My initial analysis was flawed and reload did not in fact address the problem. I was able to observe the neighbors all having issues again and this time I saw that a neighbor relationship was stuck in 'loading' state. After a lot of looking around, I determined that the interface and ip mtu both were 9216. I suspect that this mtu was too big and that my igp had grown to a point where fragmentation was needed but due to an incorrect mtu here, the updates were not working due to full size packets. I added 'ip mtu 9000' and the problem has not come back since (>10 hours now).

Mike-
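To illustrate the suspicion above, here is a rough sketch. It assumes the transport really carries at most 9000 bytes (not confirmed at this point in the thread), and the sample packet sizes are made up. Small packets fit either way, so only packets built up to the configured ip mtu of 9216 would go missing, which would fit a neighbor that forms but then gets stuck.

```python
ASSUMED_PATH_MTU = 9000  # assumption: what the links really carry, in bytes


def dropped_sizes(packet_sizes, path_mtu):
    """Return the packet sizes (bytes) that exceed path_mtu and would be lost."""
    return [size for size in packet_sizes if size > path_mtu]


def sample_sizes(ip_mtu):
    # Illustrative sizes only: a small packet, a mid-size one,
    # and a full-size one built right up to the configured ip mtu.
    return [64, 1500, ip_mtu]


# With ip mtu 9216 the full-size packet exceeds the assumed path limit.
print(dropped_sizes(sample_sizes(9216), ASSUMED_PATH_MTU))
# With ip mtu 9000 nothing in the sample exceeds it.
print(dropped_sizes(sample_sizes(9000), ASSUMED_PATH_MTU))
```

The first call returns only the 9216-byte entry and the second returns an empty list, matching the observation that lowering the ip mtu to 9000 made the problem go away.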
Re: [c-nsp] me3600 ospf %100 cpu blowup
I'll take a stab at it...

Show log... (prior to reboot, so you may need to look at syslog...)

If you see NILE ASIC errors of some sort, I recall TAC telling me there isn't a fix and reboot is required. :|

I recall the nile asic thing being l2vpn related so I dunno about the ospf thing

-Aaron
[c-nsp] me3600 ospf %100 cpu blowup
Hi,

I have an me3600x that runs ospf and carries just a few eompls tunnels and it's been running for 2 1/2 years without a hiccup. It's a hub in the middle connecting a few other odd me3600's, asr-920's and asr1. This morning, it stopped exchanging routes with one router (an asr1000, directly attached) which logged messages about LDP being down and too many ospf retransmissions. Doing 'clear ip ospf 1 process' on the me3600x alleviated the problem for a while, and then it happened again. After a few go-arounds trying to get logs and finding no smoking gun, and hitting 'clear ip ospf 1 process' again, the whole box freaked out and ospf went to nearly 100% cpu and at that point we were dead dead dead. Using oob management I was able to get back in, again no obvious logs, 'clear ip ospf 1 process' and all was good. For about 5 minutes. Then ospf spiked again. I finally did a reload on the box assuming there must be some kind of real internal memory corruption. So far it's been holding since reload.

Does this smell like a bug that anyone has run into before? My software is me360x_t-universalk9-mz.154-3.S2.bin and while a bit dated has been stable...until today.

Thank you.

Mike-