travis+ml-lang...@subspacefield.org writes:

> So Zalewski says that because of parser divergence, the only safe way
> to proxy HTML (and do any security filtering) is to parse the HTML via
> a method that is widely used by browsers into an AST and reconstruct
> the HTML from the AST, ignoring all unknown tags and attributes and
> unsafe constructs (using the safest encoding, for example).  Many
> people start from an assumption that something smaller is sufficient,
> but they are wrong.

For the record, an example of such a sanitizer implemented in Python:
<http://code.google.com/p/soclone/source/browse/trunk/soclone/utils/html.py>
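To make the parse-then-rebuild idea concrete, here is a minimal sketch. It is not the code behind the link above, and it uses Python's standard-library html.parser as a stand-in, which is an event-based parser and not the browser-grade one Zalewski's argument calls for, so treat it as an illustration of the allowlist step only. ALLOWED_TAGS, DROP_CONTENT, SAFE_URL_PREFIXES, AllowlistRebuilder and sanitize are hypothetical names chosen for this example.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration: tag name -> allowed attribute names.
ALLOWED_TAGS = {"p": set(), "em": set(), "strong": set(), "a": {"href"}}
# Elements whose text content is dropped along with the tag itself.
DROP_CONTENT = {"script", "style"}
# Assumption: only absolute http(s) links are considered safe.
SAFE_URL_PREFIXES = ("http://", "https://")


class AllowlistRebuilder(HTMLParser):
    """Re-emit only allowlisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of the rebuilt document
        self.open_tags = []  # allowlisted tags that are currently open
        self.skip = 0        # > 0 while inside a DROP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip += 1
            return
        if tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag, keep its text
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_TAGS[tag] or value is None:
                continue  # unknown or valueless attribute: drop it
            if name == "href" and not value.strip().lower().startswith(
                SAFE_URL_PREFIXES
            ):
                continue  # e.g. javascript: URLs
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self.skip:
                self.skip -= 1
            return
        if tag in self.open_tags:
            # Close everything opened after this tag, then the tag itself.
            while self.open_tags:
                top = self.open_tags.pop()
                self.out.append("</%s>" % top)
                if top == tag:
                    break

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistRebuilder()
    parser.feed(html_text)
    parser.close()
    # Close anything the input left open so the output is balanced.
    while parser.open_tags:
        parser.out.append("</%s>" % parser.open_tags.pop())
    return "".join(parser.out)
```

The output is always generated from scratch: nothing from the input is copied through unescaped, so anything the allowlist does not name simply never appears.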

> I presume something similar is useful for HTTP as well, where there's
> similar ambiguities in repeated header lines, content-type, encoding,
> file attachments, quoted strings, and so on.

Repeated header lines in HTTP are only allowed if they can be combined
into a legitimate comma-separated list value. See RFC 7230, Section 3.2.2:

> A sender MUST NOT generate multiple header fields with the same field
> name in a message unless either the entire field value for that
> header field is defined as a comma-separated list [i.e., #(values)]
> or the header field is a well-known exception (as noted below).

<https://tools.ietf.org/html/rfc7230#section-3.2.2>
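A minimal sketch of how a strict intermediary might apply that rule, assuming it knows which field names are defined as comma-separated lists. The function name combine_headers, the list_valued parameter and the choice to reject other repeats with an error are assumptions for this example, not something the RFC prescribes. Set-Cookie is the well-known exception the RFC text refers to, so it is collected separately and never joined.

```python
def combine_headers(pairs, list_valued):
    """Fold repeated header fields into single comma-separated values.

    pairs:       list of (name, value) tuples in message order
    list_valued: set of lowercase field names defined as #(values) lists
    Returns (combined, set_cookies): a dict of lowercase name -> value,
    and the Set-Cookie values kept apart in their original order.
    """
    combined = {}
    set_cookies = []
    for name, value in pairs:
        key = name.lower()  # field names are case-insensitive
        value = value.strip()
        if key == "set-cookie":
            # Well-known exception: cannot be merged with commas.
            set_cookies.append(value)
        elif key not in combined:
            combined[key] = value
        elif key in list_valued:
            # Append in order, separated by a comma.
            combined[key] += ", " + value
        else:
            # Policy assumption: refuse ambiguous input outright.
            raise ValueError("repeated non-list header field: %s" % name)
    return combined, set_cookies
```

For example, two Accept lines fold into one value, while a second Content-Length line raises ValueError, so no downstream component gets to pick its own interpretation.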

> An interesting paper would be on XML parser engines, how they operate,
> and what that means for langsec.  Because the parser engine and whether
> it works on callbacks or ASTs and so on, combined with the semantics
> of parsing, creates divergences.

Can you show any examples for XML parser engine differentials?

> And oh, if you happen to be in the San Francisco Bay Area, we have a
> CFP out (1 Mar 2015 deadline), need volunteers, and sponsors:
>
> https://bsidessf.com/w/index.php/Main_Page
>
> Cheers :-)
> -- 
> http://www.subspacefield.org/~travis/
> Split a packed field and I am there; parse a line of text and you will find 
> me.
>
> _______________________________________________
> langsec-discuss mailing list
> langsec-discuss@mail.langsec.org
> https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss

-- 
Nils Dagsson Moskopp // erlehmann
<http://dieweltistgarnichtso.net>
