Tony Li wrote (on Fri 04-Jan-2013 at 17:02 +0000):
> To: Jeff Wheeler
....
> I agree that the Message Length will either be correct or it won't.
> Now, all we need is an oracle (the computer science kind ;-), to
> tell us which is which.  Got any handy?  ;-)

The outermost framing for the message is the Marker (16 x 0xFF)
followed by two octets of Message Length and one of Type.  Message
Length must be >= 19 and <= 4096 (currently), and >= 23 for an UPDATE
message.  So, there are some constraints there.
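
To make those outer-framing constraints concrete, here is a minimal
sketch in Python.  The function name check_header and the constant
names are my own, purely for illustration; the numeric limits are the
ones given above, and the UPDATE type code (2) is the standard one.

```python
import struct

MARKER = b"\xff" * 16      # 16 octets of 0xFF
HEADER_LEN = 19            # Marker (16) + Message Length (2) + Type (1)
MAX_LEN = 4096             # current upper limit on Message Length
MIN_UPDATE_LEN = 23        # header + two empty 2-octet length fields
TYPE_UPDATE = 2

def check_header(buf):
    """Return (length, msg_type) if the outer framing is plausible.

    Hypothetical helper: raises ValueError on any violation of the
    constraints described in the text.
    """
    if len(buf) < HEADER_LEN:
        raise ValueError("fewer than 19 octets available")
    if buf[:16] != MARKER:
        raise ValueError("Marker is not 16 x 0xFF")
    # Message Length is a 2-octet big-endian integer, Type is 1 octet.
    length, msg_type = struct.unpack("!HB", buf[16:19])
    if not HEADER_LEN <= length <= MAX_LEN:
        raise ValueError("Message Length out of range")
    if msg_type == TYPE_UPDATE and length < MIN_UPDATE_LEN:
        raise ValueError("UPDATE shorter than 23 octets")
    return length, msg_type
```

Note that passing these checks says only that the length is
plausible, not that it is correct -- which is the oracle problem.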

The next level of framing is the Withdrawn Routes Length and the Total
Path Attributes Length.  And 23 + those must not exceed the Message
Length (whatever is left over is the NLRI).
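
As a rough illustration of that inner-framing check, a sketch in
Python follows.  The name split_update is hypothetical, and the
sketch assumes the caller hands over exactly one complete UPDATE
message, 19-octet header included.  It treats any disagreement
between the inner lengths and the Message Length as an error and
leaves the choice of response (e.g. session reset) to the caller.

```python
import struct

def split_update(msg):
    """Split one complete UPDATE into (withdrawn, attrs, nlri) bytes.

    Hypothetical helper; raises ValueError if the inner lengths do
    not fit inside the Message Length.
    """
    if len(msg) < 23:
        raise ValueError("UPDATE shorter than 23 octets")
    (msg_len,) = struct.unpack("!H", msg[16:18])
    if msg_len != len(msg):
        raise ValueError("buffer is not exactly one message")
    # Withdrawn Routes Length sits immediately after the header.
    (wlen,) = struct.unpack("!H", msg[19:21])
    if 23 + wlen > msg_len:
        raise ValueError("Withdrawn Routes overrun the message")
    attr_len_off = 21 + wlen
    (alen,) = struct.unpack("!H", msg[attr_len_off:attr_len_off + 2])
    if 23 + wlen + alen > msg_len:
        raise ValueError("Path Attributes overrun the message")
    attrs_off = attr_len_off + 2
    withdrawn = msg[21:21 + wlen]
    attrs = msg[attrs_off:attrs_off + alen]
    # NLRI has no length field of its own: it is whatever remains.
    nlri = msg[attrs_off + alen:msg_len]
    return withdrawn, attrs, nlri
```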

That (as per the drafts) is about it, oracle-wise.  A CRC would be
nice -- but not amongst the Attributes, except, perhaps between the
MP_XXX at the front of the Attributes and the rest.  But long runs of
0xFF are pretty rare in the body of a message, so the Marker is not
bad.

When we start considering what buggy software might be capable of
throwing out, I think we are in danger of running,
Wile-E.-Coyote-wise, out into mid-air...

Most code is tested before release.  Some parts of the code are in
constant use.  I think it is practical to consider some things
dependable.  (Never 100% guaranteed, but many-sigma.)

The outermost framing for a BGP Message is common to all messages.
The code which writes the Message Length can be written so that the
value is very closely tied to the value used to dispatch the message
to the output buffers.  This stuff is pretty well exercised, and I
think that a systematic error here is unlikely.  I can imagine some
memory management, or multi-threading, or random corruption, or other
exotic bug which could affect this outermost framing.  And such a bug
could lie dormant for a long time.  These sorts of issues are
notoriously capricious and hard to reproduce.  Which is a Bad Thing
but also a Good Thing, because following a session reset the bug may
never be seen again !

The inner framing of Withdrawn Routes, Path Attribute and NLRI is also
code in constant use.  Given a discrepancy between the lengths of
these and the Message Length, if pushed I would go with the Message
Length.  But, I would prefer to treat such a discrepancy as a
session-reset.  Mostly because I think that the balance of
probabilities is that an error here is more likely to be a symptom of
some exotic error than anything else.

[Plenty of hostages to fortune there... I look forward to hearing of
counter examples :-)]

This is all per the drafts.

When it comes down to it, the most likely place for systematic errors
to occur is in obscure corners of attribute handling, e.g. the
well-known RIPE/DUKE "experiment".  I think that concentrating on
those would be the most productive.  There are enough issues to
resolve there to keep one busy.

It occurs to me: one issue with treating the Marker etc. as the one
true message boundary, is that if the code finds itself reading
forwards, looking for the next start of message, that implies that the
Message Length on the _previous_ one was probably wrong...  but all
decisions relating to that previous message have already been made :-(
That's not a problem if you reset the session.  In extremis I guess
one has bigger issues to worry about !

An option to resync on Marker etc, as a last-ditch,
keep-the-session-up-at-all-costs measure, doesn't seem that difficult
to me -- and wouldn't (key consideration) interfere with code for
normal running.
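
For what it's worth, such a resync might look something like the
Python sketch below.  The function name resync is made up, and the
acceptance test (Message Length in range, Type between 1 and 5) is my
own assumption about what counts as a plausible header; it is not
something the drafts specify.

```python
import struct

MARKER = b"\xff" * 16
HEADER_LEN = 19
MAX_LEN = 4096

def resync(buf, start=0):
    """Return the offset of the next plausible header, or -1.

    Hypothetical last-ditch scan: looks for 16 x 0xFF followed by a
    Message Length and Type that pass basic sanity checks.
    """
    pos = buf.find(MARKER, start)
    while pos != -1:
        if len(buf) - pos < HEADER_LEN:
            return -1          # candidate is incomplete; need more data
        length, msg_type = struct.unpack("!HB", buf[pos + 16:pos + 19])
        # Assumption: accept only known type codes 1..5
        # (OPEN, UPDATE, NOTIFICATION, KEEPALIVE, ROUTE-REFRESH).
        if HEADER_LEN <= length <= MAX_LEN and 1 <= msg_type <= 5:
            return pos
        # A run of more than 16 x 0xFF yields overlapping candidates,
        # so advance one octet at a time.
        pos = buf.find(MARKER, pos + 1)
    return -1
```

Anything skipped over before the returned offset is lost, of course,
and the message before it was presumably mis-framed.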

Chris

_______________________________________________
GROW mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/grow
