So Zalewski says that because of parser divergence, the only safe way
to proxy HTML (and do any security filtering) is to parse the HTML
with a parser that behaves the way browsers' parsers actually do,
build an AST, and reconstruct the HTML from that AST, dropping all
unknown tags, attributes, and unsafe constructs (and emitting the
safest encoding, for example).  Many people start from the assumption
that something smaller is sufficient, but they are wrong.
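To make the allowlist-and-reserialize idea concrete, here is a toy
sketch in Python.  The tag/attribute allowlists are my own invention,
and the stdlib html.parser is emphatically not a browser-grade parser
(a real filter would want something like html5lib, which implements
the WHATWG parsing algorithm, and would work from the full tree rather
than a stream):

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists, chosen purely for illustration.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "a", "ul", "ol", "li"}
ALLOWED_ATTRS = {"a": {"href"}}
DROP_CONTENT = {"script", "style"}  # drop these tags AND their contents

class Sanitizer(HTMLParser):
    """Re-emit only allowlisted tags/attributes; escape all text."""
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip = 0  # nesting depth inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip += 1
            return
        if self.skip or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag, keep its text children
        kept = [
            (k, v or "") for k, v in attrs
            if k in ALLOWED_ATTRS.get(tag, set())
            and not (v or "").lower().lstrip().startswith("javascript:")
        ]
        attr_str = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{attr_str}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip = max(0, self.skip - 1)
            return
        if not self.skip and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data))

def sanitize(html_text):
    s = Sanitizer()
    s.feed(html_text)
    s.close()
    return "".join(s.out)
```

Even this toy shows why the full-AST version matters: the stream view
has no notion of tree well-formedness, so mismatched or unclosed tags
pass straight through where a tree serializer would repair them.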

I presume something similar is useful for HTTP as well, where there
are similar ambiguities in repeated header lines, content types,
encodings, file attachments, quoted strings, and so on.
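For instance, a strict proxy could normalize headers into one
canonical form and refuse anything two downstream parsers might read
differently.  A sketch (my own invention, not from the thread; a real
proxy has many more cases, e.g. Set-Cookie must never be comma-joined):

```python
# Headers where a repeat is a smuggling-style ambiguity, not a list.
SINGLETON = {"host", "content-length", "content-type",
             "transfer-encoding"}

def normalize_headers(raw_headers):
    """raw_headers: list of (name, value) pairs as seen on the wire.

    Returns one canonical dict, raising on inputs that different
    downstream parsers might interpret differently."""
    seen = {}
    for name, value in raw_headers:
        key = name.strip().lower()   # fold case and whitespace tricks
        value = value.strip()
        if key not in seen:
            seen[key] = value
        elif key in SINGLETON:
            raise ValueError(f"ambiguous repeated header: {key}")
        else:
            # RFC 7230 lets repeated list-valued fields be comma-joined.
            seen[key] = seen[key] + ", " + value
    return seen
```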

An interesting paper would be one on XML parser engines: how they
operate, and what that means for langsec.  Whether an engine delivers
events via callbacks (SAX-style) or builds a full tree (DOM-style),
combined with the semantics of parsing, creates divergences.
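One such engine-level divergence is visible even in the Python stdlib.
The exact chunking below is an expat implementation detail, so treat
it as a sketch:

```python
import xml.parsers.expat
import xml.etree.ElementTree as ET

doc = b"<root>abc&amp;def</root>"

# Tree engine: the application sees one normalized text value.
tree_text = ET.fromstring(doc).text

# Callback engine: raw expat (buffer_text defaults to False) may
# report the same character data in several pieces, typically split
# around the entity reference.
chunks = []
p = xml.parsers.expat.ParserCreate()
p.CharacterDataHandler = chunks.append
p.Parse(doc, True)

# A handler written as if there were "one callback per text node" can
# disagree with the tree view over identical input.
print(repr(tree_text), repr(chunks))
```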

And oh, if you happen to be in the San Francisco Bay Area: we have a
CFP out (deadline 1 Mar 2015), and we need volunteers and sponsors:

https://bsidessf.com/w/index.php/Main_Page

Cheers :-)
-- 
http://www.subspacefield.org/~travis/
Split a packed field and I am there; parse a line of text and you will find me.


_______________________________________________
langsec-discuss mailing list
langsec-discuss@mail.langsec.org
https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss
