Jeff,

>> While we can do SOME things to decrease session resets, we cannot fix all
>> cases and simply treating things as a withdraw and walking away is wholly
>> unacceptable, as some of you will hopefully agree. Creating arbitrary hair
>> here is NOT going to help as the error handling code itself will become
>> fraught with errors.
>
> Every operator I've asked thinks "ignore bad BGP messages," which is
> even more extreme than treat-as-withdraw, is a good idea.

I'm sure if you asked folks if they wanted anything that helped them
and completely ignored the costs, they would say yes. However, if you
want to make a reasonable, rational, and justified decision, you must
consider the implications of the request. That's what I'm asking.
Consider implementor input as well, because there are some practical
considerations here.

> Respectfully, you are about as wrong as one could get on this issue.

Respectfully, I think you're misunderstanding my position completely.
My point is that a reasonable implementation cannot possibly live up to
the expectations that you're setting up here. To be specific, once an
implementation loses the syntactic parsing of the data stream,
realistically, the session is corrupt and an eventual reset is
inevitable. Or, in other words, BGP cannot possibly ignore bad
messages. That's not the way it works.

> Of course customers don't want one prefix to be broken. Fifteen years
> of "CEF problem" have taught us all that these conditions are hard to
> troubleshoot. However, it is often preferable to have one or many
> prefixes broken, than have a BGP session flap endlessly due to some
> bug.

All of the marketing that you're doing here is positioning this as a
'solution'. It's not. Yes, it will stop the flap, but it does NOTHING
to fix or deal with the underlying bug. All it does is gloss it over.
As a result, it will have implications in the field: it papers over
real bugs, and we have now promoted BGP errors into RIB errors. That's
NOT making things easier to debug, that's just applying a band-aid.

A more constructive way to address the real problem here would be to
talk about whether we should even re-establish the session after an
error. Long ago, we made an implementation decision to simply retry.
That would seem to be the real issue at hand.

> The vendor should have some standards body coverage for giving the
> operator this knob, and customers are right to ask for it. Sure,
> you're giving us more rope. Sometimes that is what we need.

Sorry, but the point of the standards body is to standardize PROTOCOL
changes. Everything that has been discussed here is an IMPLEMENTATION
ISSUE. We don't standardize those, for very good reasons. And the
vendors need zero help from the IETF in making implementation
decisions. If real customers want a particular behavior, they can
always just ask for it, as always.

> Related to this, what is your plan for dealing with BGP Attribute
> re-ordering in the rewrite of RFC4760?

My plan? My personal plan is to ban the use of all MP extensions, as
all of that is simply evil and should be scrubbed off the face of the
earth. I'll be putting this in place as soon as I'm elected Emperor of
the Universe. ;-)

> Currently there is no
> mechanism for telling a BGP neighbor that you intend to send the
> MP-NLRIs before other Attributes. Doing that goes against RFC4271's
> recommendations. Relying on that behavior (e.g. for error-handling)
> is not possible unless the neighbor promises to do it. Thus far,
> there seems to be no intent to allocate another Capability Code for
> this. The MP-BGP Capabilities Optional Parameter was not designed to
> be extended to do other things besides announce to the neighbor what
> AFI/SAFIs it supports.

RFC 4271 doesn't require a specific ordering because it would be bad
protocol design. As soon as you require ordering, some implementation
is going to check that ordering. There will be more bugs that occur
because the mandated ordering was not followed, and more sessions will
be dropped. As always, the right thing is to follow Postel's law: be
liberal in what you accept. Requiring a specific ordering in order to
improve error handling is a bad tradeoff: you're creating a host of
additional bugs so that you can try to simplify error handling on
another set of bugs.

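As a rough illustration of accepting attributes in any order, here is
a minimal sketch. It is not taken from any real implementation: the
names `encode_attr` and `parse_path_attributes` are made up for this
example, and rejecting duplicates with an exception is just one
possible choice. It follows the attribute header layout in RFC 4271
section 4.3 (flags, type code, one- or two-octet length, value) and
indexes attributes by type code, so the result does not depend on the
order the sender chose.

```python
import struct

# Flag bit from RFC 4271 section 4.3: attribute length is two octets.
EXTENDED_LENGTH = 0x10


def encode_attr(flags: int, type_code: int, value: bytes) -> bytes:
    """Hypothetical helper: encode one path attribute for the demo."""
    if len(value) > 255:
        header = struct.pack("!BBH", flags | EXTENDED_LENGTH, type_code, len(value))
    else:
        header = struct.pack("!BBB", flags & ~EXTENDED_LENGTH, type_code, len(value))
    return header + value


def parse_path_attributes(data: bytes) -> dict:
    """Return {type_code: value}; the order of attributes is irrelevant."""
    attrs = {}
    offset = 0
    while offset < len(data):
        # Flags, type code, and at least a one-octet length must be present.
        if len(data) - offset < 3:
            raise ValueError("truncated attribute header")
        flags, type_code = data[offset], data[offset + 1]
        offset += 2
        if flags & EXTENDED_LENGTH:
            if len(data) - offset < 2:
                raise ValueError("truncated extended length")
            (length,) = struct.unpack_from("!H", data, offset)
            offset += 2
        else:
            length = data[offset]
            offset += 1
        if len(data) - offset < length:
            raise ValueError("attribute value runs past end of field")
        if type_code in attrs:
            raise ValueError(f"duplicate attribute {type_code}")
        attrs[type_code] = data[offset:offset + length]
        offset += length
    return attrs


# Demo values: ORIGIN (1), NEXT_HOP (3), and an opaque blob standing in
# for MP_REACH_NLRI (14); the blob contents are arbitrary.
origin = encode_attr(0x40, 1, b"\x00")
next_hop = encode_attr(0x40, 3, bytes([192, 0, 2, 1]))
mp_reach = encode_attr(0x80, 14, b"\xaa" * 300)

mp_first = parse_path_attributes(mp_reach + origin + next_hop)
mp_last = parse_path_attributes(origin + next_hop + mp_reach)
print(mp_first == mp_last)
```

Note that the sketch still raises as soon as a length field is
inconsistent with the data. Accepting any order does nothing for the
case where the parse itself has been lost.
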
Tony
_______________________________________________
GROW mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/grow
