On Jan 3, 2013, at 10:00 AM, Tony Li <[email protected]> wrote:
>
> All of the marketing that you're doing here is positioning this as a
> 'solution'. It's not. Yes, it will stop the flap, but it does NOTHING to
> fix or deal with the underlying bug. All it does is gloss it over, and as
> such, it will have implications in the field whereby this papers over real
> bugs and we have now promoted BGP errors into RIB errors. That's NOT making
> things easier to debug, that's just applying a band-aid.
I understand what you are saying and I agree 100%. However, from my operations perspective the "fix" is the same: either upgrade to fixed code or policy out the offending announcement. I would rather deal with a customer routing issue than a frantic call from our NOC saying 15+ att peers globally are bouncing. The latter has a much bigger impact on our network. I can live with a couple of /24s not working for a few customers. I can't have 15+ peers bouncing because of bad updates, and even more peers bouncing because of missed keepalives when the CPU is pegged trying to deal with 15 peers bouncing globally.

>
> A more constructive way to address the real problem here would be to talk
> about whether we should even re-establish the session after an error. Long
> ago, we made an implementation decision to simply retry. That would seem to
> be the real issue at hand.

I would back this, provided there is adequate logging as to why the session is down. It would be much like tripping max-prefixes, where we could hard clear a single session for debug. I could live with this.

Mike

--
Michael Long
NTT Communications
Global IP Network
ph. 214.915.1352
jabber: [email protected]

_______________________________________________
GROW mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/grow
