Sergey Bratus <ser...@cs.dartmouth.edu> writes:

> Dear All,
>
> I was reading Yar Tikhiy's "IPv6 for IPv4 Experts"
> (https://sites.google.com/site/yartikhiy/home/ipv6book), and came
> across the following comment in section 4.4.1:
>
> ----- begin quote -----
> The fact is, for a long time the Internet standards were written for, and
> by, top experts of the industry, who had no problem translating the wire
> data format to a robust and secure receiver algorithm. So it is no
> surprise that the terms "protocol" and "packet format" remained
> interchangeable through a whole period of TCP/IP history. Alas, today the
> average RFC reader’s level is nowhere as high as it used to be in the
> 1970s, which has to be made up for by more rigorous and comprehensive
> standards documents leaving no room for ambiguity.
>
> Granted, a protocol can be so complex that providing only its wire data
> format will no longer be sufficient; but the failure of some
> implementations to check Version in incoming IP packets speaks of certain
> ills in the prevalent technical culture rather than of this protocol’s
> complexity.
> ----- end quote ------
>
> It seems to show both awareness of the special role of (packet)
> recognizers and of the dangers of ambiguity. Moreover, the suggested link
> between protocol, format, and its receiver (=recognizer) sounds very much
> like a LangSec insight.

One motivation for the HTML 5 specification was a similar insight: that
implementors, left to themselves, will never handle invalid input in the
same way. The HTML 5 state machine aims to unambiguously define how to
create a DOM tree from markup, even invalid markup such as
“<a><b></a></b>”.

<http://www.whatwg.org/specs/web-apps/current-work/multipage/the-end.html#an-introduction-to-error-handling-and-strange-cases-in-the-parser>

> However, the concluding note regarding protocol complexity (in the
> Russian original, this statement is even stronger: explicit specification
> of endpoints' behaviors is stressed) appears to draw no boundaries between
> manageable and unmanageable complexity of either protocols or formats,
> but rather blames the detrimental effects on the technical culture.
>
> Yet, as LangSec argues, there is a boundary beyond which protocol
> implementation differences cannot be blamed on the poor culture of
> implementors, as verifying equivalence of recognizer implementations
> becomes undecidable. Blaming the implementors becomes, in essence,
> blaming the victims of format complexity for their inability to solve
> an equivalent of the halting problem, i.e., patently unfair.

I consider it fair to blame them for not recognizing this as a problem.
When I told a friend without a formal computer science background about
recognizers, he showed me code he had written for parsing configuration
files: it intentionally crashed on unexpected input. Intuitively, he knew
that sometimes the winning move is not to play.

One solution might be to avoid protocol complexity by implementing only
those parts of a protocol that stay below a specific complexity.
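
The configuration-file anecdote above can be illustrated with a small fail-fast recognizer. This is only a sketch under stated assumptions, not the friend's actual code (which is not shown in this thread): the `key = value` line format, the `ALLOWED_KEYS` set, and the name `parse_config` are all hypothetical. The point it tries to show is that the accepted language is a simple regular one, and anything outside it is rejected outright rather than guessed at.

```python
import re

# Hypothetical grammar: exactly one "key = value" pair per line, nothing else.
# No comments, no blank lines, no quoting, no continuation lines.
LINE = re.compile(r"([a-z_]+) = ([A-Za-z0-9._/-]+)")

# Assumed key names, for illustration only.
ALLOWED_KEYS = {"host", "port", "log_file"}


def parse_config(text):
    """Return a dict of settings, or raise ValueError on any unexpected input."""
    settings = {}
    for number, line in enumerate(text.splitlines(), start=1):
        # fullmatch: the whole line must belong to the grammar.
        match = LINE.fullmatch(line)
        if match is None:
            raise ValueError(f"line {number}: does not match the grammar")
        key, value = match.groups()
        if key not in ALLOWED_KEYS:
            raise ValueError(f"line {number}: unknown key {key!r}")
        if key in settings:
            raise ValueError(f"line {number}: duplicate key {key!r}")
        settings[key] = value
    return settings
```

With this sketch, `parse_config("host = example.org\nport = 8080")` returns a dictionary with two entries, while an input such as `"port = 8080; rm -rf /"` raises `ValueError` instead of being partially interpreted.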

Fefe recently wrote that he would only accept DER-encoded data
(unambiguous length encoding), even if the standard mentions BER-encoded
data:

<https://blog.fefe.de/?ts=ada94b9c>

Rejecting Turing-complete input languages has become a time-honored
tradition on the World Wide Web, where browsing without Java became
browsing without Flash became browsing without JavaScript: a simpler,
safer alternative that not only avoids an entire attack surface but also
lets you keep 150 HTML buffers open on an old ThinkPad without slowing it
to a crawl.

> It seems to me that this is where LangSec offers the next step
> beyond the undeniably great and time-honored intuitions of protocol
> practitioners.

Do you consider accepting only a safe subset of a language a proper
course of action?

-- 
Nils Dagsson Moskopp // erlehmann
<http://dieweltistgarnichtso.net>
_______________________________________________
langsec-discuss mailing list
langsec-discuss@mail.langsec.org
https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss