On Jan 3, 2013, at 10:00 AM, Tony Li <[email protected]> wrote:
> 
> 
> All of the marketing that you're doing here is positioning this as a 
> 'solution'.  It's not.  Yes, it will stop the flap, but it does NOTHING to 
> fix or deal with the underlying bug.  All it does is gloss it over, and as 
> such, it will have implications in the field whereby this papers over real 
> bugs and we have now promoted BGP errors into RIB errors.  That's NOT making 
> things easier to debug, that's just applying a band-aid.

I understand what you are saying and I agree 100%. However, from my 
operations perspective the "fix" is the same: either upgrade to fixed code or 
policy out the offending announcement. I would rather deal with a customer 
routing issue than a frantic call from our NOC saying 15+ att peers globally are 
bouncing. The latter has a much bigger impact on our network. 

I can live with a couple of /24's not working for a few customers. I can't have 
15+ peers bouncing because of bad updates, and even more peers bouncing because 
of missed keepalives while the CPU is pegged trying to deal with 15 peers 
bouncing globally. 

> 
> A more constructive way to address the real problem here would be to talk 
> about whether we should even re-establish the session after an error.  Long 
> ago, we made an implementation decision to simply retry.  That would seem to 
> be the real issue at hand.

I would back this, provided there is adequate logging as to why the session is 
down. It would be much like tripping max-prefixes, where we could hard clear a 
single session for debug. I could live with this. 

Mike
-- 
Michael Long
NTT Communications Global IP Network
ph. 214.915.1352
jabber: [email protected]



_______________________________________________
GROW mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/grow
