Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-15 Thread Mark Tinka


On 15/Jan/18 16:25, Aaron Gould wrote:

> ospf neighbors won't come up either with different mtu's

That is what I remember as well, yes... 2007 was the last time I ran
anything OSPF.

Mark.
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-15 Thread Aaron Gould
ospf neighbors won't come up either with different mtu's
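
For what it's worth, the check behind that behaviour is described in RFC 2328 section 10.6: each Database Description (DBD) packet carries the sender's interface MTU, and the receiver rejects the packet if that value is larger than what its own interface can accept without fragmentation. Below is a minimal Python sketch of that comparison. The function name and the example values are only illustrative and are not taken from any router implementation. Note that it compares configured values only, so it says nothing about what the transport in between really carries.

```python
def accepts_dbd(advertised_mtu: int, local_ip_mtu: int) -> bool:
    """Receiver-side check on the Interface MTU field of a DBD packet.

    Hypothetical helper modelled on RFC 2328 section 10.6: the packet is
    rejected when the neighbor advertises a larger MTU than the receiving
    interface can take without fragmentation.
    """
    return advertised_mtu <= local_ip_mtu


# Mismatched configuration: the 9000 side rejects the 9216 side's DBD.
print(accepts_dbd(advertised_mtu=9216, local_ip_mtu=9000))  # False
# Matching configuration passes this check regardless of the real path MTU.
print(accepts_dbd(advertised_mtu=9216, local_ip_mtu=9216))  # True
```

So two ends that are both set to 9216 pass this check and can still lose full-size packets on the wire, which is the situation suspected in this thread.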

-----Original Message-----
From: cisco-nsp [mailto:cisco-nsp-boun...@puck.nether.net] On Behalf Of Mark
Tinka
Sent: Monday, January 15, 2018 8:00 AM
To: Aaron
Cc: cisco-nsp@puck.nether.net
Subject: Re: [c-nsp] me3600 ospf %100 cpu blowup



On 14/Jan/18 17:36, Aaron wrote:

> Size of the ospf table

Been a long while since I ran OSPF in production - but I know IS-IS tests
the MTU as adjacencies are built, and won't work unless PDU's are sent
unfragmented across the wire.

Mark.


Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-15 Thread Aaron Gould
I had something similar happen to me a couple months ago, and posted it
here...

[c-nsp] ospf database size - affects that underlying transport mtu might
have

https://www.mail-archive.com/cisco-nsp@puck.nether.net/msg65794.html


- Aaron




Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-15 Thread Mark Tinka


On 14/Jan/18 17:36, Aaron wrote:

> Size of the ospf table

Been a long while since I ran OSPF in production - but I know IS-IS
tests the MTU as adjacencies are built, and won't work unless PDU's are
sent unfragmented across the wire.

Mark.


Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-14 Thread Aaron
Size of the ospf table

On Sunday, January 14, 2018, Mark Tinka wrote:

>
>
> On 13/Jan/18 18:33, adamv0...@netconsultings.com wrote:
>
> > Hmm could it be that you hit the mtu limit of your links (which is not
> > 9216 but just 9000)?
>
> That would make sense - but if it's been working all this time, what
> changed?
>
> Is your transport network dark or leased?
>
> Mark.


Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-14 Thread Mark Tinka


On 13/Jan/18 18:33, adamv0...@netconsultings.com wrote:

> Hmm could it be that you hit the mtu limit of your links (which is not 9216
> but just 9000)?

That would make sense - but if it's been working all this time, what
changed?

Is your transport network dark or leased?

Mark.



Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-14 Thread adamv0025
> Mike
> Sent: Sunday, January 14, 2018 2:32 AM
> 
> 
> >> mtu 9000' and the problem has not come back since
> >> (>10 hours now).
> >>
> >>
> > Hmm could it be that you hit the mtu limit of your links (which is not
> > 9216 but just 9000)?
> >
> > adam
> 
> That's what I'm thinking; the hardware mtu was set to 9216 but for whatever
> reason the mtu actually was not that and so ospf flooding couldn't complete
> since full mtu sized packets were lost/damaged.
> Setting the ip mtu down to 9000 was the answer. Ugh.
> 
Try pinging with a packet size larger than 9000 and the DF bit set (so it
can't be fragmented) to see if that is indeed the case.
Or you can check with the provider of these links what MTU they actually
support.
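
One way to run that test systematically is a binary search over the packet size with the DF bit set. The Python sketch below assumes a `probe(size)` callable that returns True when a don't-fragment ping of that IP packet size gets a reply. `probe`, `largest_passing_size` and the size bounds are hypothetical names and values chosen for illustration. On IOS the individual probe would be something along the lines of `ping <neighbor> size <n> df-bit`.

```python
from typing import Callable, Optional


def largest_passing_size(probe: Callable[[int], bool],
                         low: int, high: int) -> Optional[int]:
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes the usual MTU behaviour: every size up to some limit passes and
    every size above it fails. Returns None if even `low` does not pass.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if probe(mid):
            low = mid        # mid got through, the limit is at least mid
        else:
            high = mid - 1   # mid was dropped, the limit is below mid
    return low


# Stand-in for a real DF-bit ping: simulates a path that carries 9000 bytes.
def fake_probe(size: int) -> bool:
    return size <= 9000


print(largest_passing_size(fake_probe, 1500, 9216))  # 9000
```

With a real probe, a result below the configured interface mtu would confirm that the transport carries less than what the routers are set to.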

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::




Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-13 Thread Mike

>> mtu 9000' and the problem has not come back since
>> (>10 hours now).
>>
>>
> Hmm could it be that you hit the mtu limit of your links (which is not 9216
> but just 9000)?
>
> adam

That's what I'm thinking; the hardware mtu was set to 9216 but for
whatever reason the mtu actually was not that and so ospf flooding
couldn't complete since full mtu sized packets were lost/damaged.
Setting the ip mtu down to 9000 was the answer. Ugh.







Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-13 Thread adamv0025
> Mike
> Sent: Saturday, January 13, 2018 3:36 AM
> 
> 
> My initial analysis was flawed and reload did not in fact address the
> problem. I was able to observe the neighbors all having issues again and
> this time I saw that a neighbor relationship was stuck in 'loading'
> state. After a lot of looking around, I determined that the interface
> and ip mtu both were 9216. I suspect that this mtu was too big and that
> my igp had grown to a point where fragmentation was needed but due to an
> incorrect mtu here, the updates were not working due to full size
> packets. I added 'ip mtu 9000' and the problem has not come back since
> (>10 hours now).
> 
> 
Hmm could it be that you hit the mtu limit of your links (which is not 9216
but just 9000)?

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::




Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-12 Thread Mark Tinka


On 13/Jan/18 05:36, Mike wrote:

> So ... the plot thickens.
>
> My initial analysis was flawed and reload did not in fact address the
> problem. I was able to observe the neighbors all having issues again and
> this time I saw that a neighbor relationship was stuck in 'loading'
> state. After a lot of looking around, I determined that the interface
> and ip mtu both were 9216. I suspect that this mtu was too big and that
> my igp had grown to a point where fragmentation was needed but due to an
> incorrect mtu here, the updates were not working due to full size
> packets. I added 'ip mtu 9000' and the problem has not come back since
> (>10 hours now).

Hmmh - not sure how this makes sense. But if it's working...

Mark.


Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-12 Thread Mike
On 01/12/2018 12:26 PM, Aaron Gould wrote:
> I'll take a stab at it...
>
> Show log... (prior to reboot, so you may need to look at syslog...)
>
> If you see NILE ASIC errors of some sort, I recall TAC telling me there isn't 
> a fix and reboot is required.  :|
>
> I recall the nile asic thing being l2vpn related so I dunno about the 
> ospf thing
>
> -Aaron
>
>
>
>

So ... the plot thickens.

My initial analysis was flawed and reload did not in fact address the
problem. I was able to observe the neighbors all having issues again and
this time I saw that a neighbor relationship was stuck in 'loading'
state. After a lot of looking around, I determined that the interface
and ip mtu both were 9216. I suspect that this mtu was too big and that
my igp had grown to a point where fragmentation was needed but due to an
incorrect mtu here, the updates were not working due to full size
packets. I added 'ip mtu 9000' and the problem has not come back since
(>10 hours now).
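
A rough way to see why this could stay hidden for years and then bite once the database grew: OSPF hellos are small, and with a small LSDB the LS Updates are small too, so nothing ever approaches the 9000 bytes the path apparently carries. The Python sketch below packs LSAs greedily into LS Update packets up to the configured ip mtu and lists the packets that exceed the path MTU. The greedy packing, the 36-byte LSA size (a type-5 external LSA) and the LSA counts are assumptions for illustration only, and actual IOS packing behaviour may differ.

```python
IP_HEADER = 20    # IPv4 header without options
OSPF_HEADER = 24  # OSPFv2 common packet header (RFC 2328, A.3.1)
LSU_FIXED = 4     # "# LSAs" field at the start of an LS Update body
OVERHEAD = IP_HEADER + OSPF_HEADER + LSU_FIXED


def pack_ls_updates(lsa_sizes, ip_mtu):
    """Greedily pack LSAs into LS Updates; return the IP size of each packet."""
    packets = []
    current = OVERHEAD
    for size in lsa_sizes:
        # Start a new packet when this LSA would push us past the ip mtu.
        if current + size > ip_mtu and current > OVERHEAD:
            packets.append(current)
            current = OVERHEAD
        current += size
    if current > OVERHEAD:
        packets.append(current)
    return packets


def too_big(packets, path_mtu):
    """Packets that the path would drop (no fragmentation assumed)."""
    return [p for p in packets if p > path_mtu]


small_db = [36] * 100    # hypothetical small LSDB
large_db = [36] * 1000   # hypothetical LSDB after the igp has grown

print(too_big(pack_ls_updates(small_db, ip_mtu=9216), path_mtu=9000))  # []
print(too_big(pack_ls_updates(large_db, ip_mtu=9216), path_mtu=9000))  # [9192, 9192, 9192]
print(too_big(pack_ls_updates(large_db, ip_mtu=9000), path_mtu=9000))  # []
```

Under these assumptions the small database fits in one 3648-byte packet and never trips over the limit, the grown database produces 9192-byte packets that a 9000-byte path drops, and lowering the ip mtu to 9000 keeps every packet at 8976 bytes or less.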


Mike-



Re: [c-nsp] me3600 ospf %100 cpu blowup

2018-01-12 Thread Aaron Gould
I'll take a stab at it...

Show log... (prior to reboot, so you may need to look at syslog...)

If you see NILE ASIC errors of some sort, I recall TAC telling me there isn't a 
fix and reboot is required.  :|

I recall the nile asic thing being l2vpn related so I dunno about the ospf 
thing

-Aaron




[c-nsp] me3600 ospf %100 cpu blowup

2018-01-12 Thread Mike
Hi,   

    I have an me3600x that runs ospf and carries just a few eompls
tunnels and it's been running for 2 1/2 years without a hiccup. It's a
hub in the middle connecting a few other odd me3600's, asr-920's and
asr1.

    This morning, it stopped exchanging routes with one router (an
asr1000, directly attached) which logged messages about LDP being down
and too many ospf retransmissions.  Doing 'clear ip ospf 1 process' on
the me3600x alleviated the problem for a while, and then it happened
again. After a few go arounds trying to get logs and finding no smoking
gun, and hitting 'clear ip ospf 1 process' again... the whole box freaked
out and ospf went to nearly 100% cpu and at that point we were dead dead
dead.

    Using oob management I was able to get back in, again no obvious
logs, clear ip ospf 1 process and all was good. For about 5 minutes.
Then ospf spiked again. I finally did a reload on the box assuming there
must be some kind of real internal memory corruption. So far it's been
holding since reload.

    Does this smell like a bug that anyone has run into before? My
software is me360x_t-universalk9-mz.154-3.S2.bin and while a bit dated
has been stable...until today.

Thank you.


Mike-

