Tony Li wrote (on Fri 04-Jan-2013 at 17:02 +0000):
> To: Jeff Wheeler
....
> I agree that the Message Length will either be correct or it won't.
> Now, all we need is an oracle (the computer science kind ;-), to
> tell us which is which. Got any handy? ;-)
The outermost framing for the message is the Marker (16 x 0xFF) followed by two octets of Message Length and one of Type. Message Length must be >= 19 and <= 4096 (currently), and >= 23 for an UPDATE message. So, there are some constraints there.

The next level of framing is the Withdrawn Routes Length and the Total Path Attributes Length. And 23 + those must not exceed the Message Length (whatever is left over is NLRI). That (as per the drafts) is about it, oracle-wise.

A CRC would be nice -- but not amongst the Attributes, except, perhaps, between the MP_XXX at the front of the Attributes and the rest. But long runs of 0xFF are pretty rare in the body of a message, so the Marker is not bad.

When we start considering what buggy software might be capable of throwing out, I think we are in danger of running, Wile-E.-Coyote-wise, out into mid-air...

Most code is tested before release. Some parts of the code are in constant use. I think it is practical to consider some things dependable. (Never 100% guaranteed, but many-sigma.)

The outermost framing for a BGP Message is common to all messages. The code which writes the Message Length can be written so that the value is very closely tied to the value used to dispatch the message to the output buffers. This stuff is pretty well exercised, and I think that a systematic error here is unlikely.

I can imagine some memory management, or multi-threading, or random corruption, or other exotic bug which could affect this outermost framing. And such a bug could lie dormant for a long time. These sorts of issues are notoriously capricious and hard to reproduce. Which is a Bad Thing but also a Good Thing, because following a session reset the bug may never be seen again!

The inner framing of Withdrawn Routes, Path Attributes and NLRI is also code in constant use. Given a discrepancy between the lengths of these and the Message Length, if pushed I would go with the Message Length. But I would prefer to treat such a discrepancy as a session-reset.
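To make the framing checks above concrete, here is a minimal sketch in Python. It is only an illustration: the function name check_framing and the strings it returns are made up for this example, and the UPDATE type code (2) is taken from RFC 4271 rather than from anything in this thread. It checks the Marker, the Message Length bounds, and that the two inner length fields fit inside the Message Length. It does not look inside the Attributes at all.

```python
import struct

MARKER = b"\xff" * 16
HEADER_LEN = 19         # Marker (16) + Message Length (2) + Type (1)
MAX_MESSAGE_LEN = 4096
UPDATE_TYPE = 2         # type code per RFC 4271 (assumption, not from the thread)
MIN_UPDATE_LEN = 23     # header + the two 2-octet inner length fields


def check_framing(buf: bytes) -> str:
    """Hypothetical helper: return 'ok', 'incomplete', or an error description."""
    if len(buf) < HEADER_LEN:
        return "incomplete"
    if buf[:16] != MARKER:
        return "bad marker"
    length, msg_type = struct.unpack("!HB", buf[16:19])
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        return "bad message length"
    if len(buf) < length:
        return "incomplete"
    if msg_type != UPDATE_TYPE:
        return "ok"
    if length < MIN_UPDATE_LEN:
        return "bad message length"
    # Withdrawn Routes Length sits immediately after the common header.
    (withdrawn_len,) = struct.unpack("!H", buf[19:21])
    attrs_at = 21 + withdrawn_len
    if attrs_at + 2 > length:
        return "inner lengths exceed message length"
    # Total Path Attributes Length follows the Withdrawn Routes.
    (attrs_len,) = struct.unpack("!H", buf[attrs_at:attrs_at + 2])
    if MIN_UPDATE_LEN + withdrawn_len + attrs_len > length:
        return "inner lengths exceed message length"
    return "ok"
```

Anything other than "ok" or "incomplete" is the sort of discrepancy discussed here; what to do about it (reset, or something gentler) is a separate policy decision that the sketch leaves to the caller.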
I prefer a reset mostly because I think that the balance of probabilities is that an error here is more likely to be a symptom of some exotic error than anything else. [Plenty of hostages to fortune there... I look forward to hearing of counter-examples :-)]

This is all per the drafts.

When it comes down to it, the most likely place for systematic errors to occur is in obscure corners of attribute handling, e.g. the well-known RIPE/DUKE "experiment". I think that concentrating on those would be the most productive. There are enough issues to resolve there to keep one busy.

It occurs to me: one issue with treating the Marker etc. as the one true message boundary is that if the code finds itself reading forwards, looking for the next start of message, that implies that the Message Length on the _previous_ one was probably wrong... but all decisions relating to that previous message have already been made :-( That's not a problem if you reset the session.

In extremis I guess one has bigger issues to worry about! An option to resync on Marker etc., as a last-ditch, keep-the-session-up-at-all-costs measure, doesn't seem that difficult to me -- and wouldn't (key consideration) interfere with code for normal running.

Chris
_______________________________________________
GROW mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/grow
