On 25/06/15 06:00, travis+ml-lang...@subspacefield.org wrote:
> https://stackoverflow.com/questions/30727515/why-is-executing-java-code-in-comments-with-certain-unicode-characters-allowed?stw=2

Javascript/ECMAScript has different rules for Unicode escapes than Java. It doesn't convert \u escapes before lexing; it only interprets them in identifiers and strings.

<http://www.ecma-international.org/ecma-262/6.0/index.html#sec-comments>
<http://www.ecma-international.org/ecma-262/6.0/index.html#sec-names-and-keywords>
<http://www.ecma-international.org/ecma-262/6.0/index.html#sec-literals-string-literals>

(This was the same in previous versions, and also in vendor implementations of Javascript, although there were differences in sets of allowed characters and escapes.)

Note that MIME charset decoding *is* done before interpreting Javascript. Also HTML or XML entity expansion is potentially tricky, if the Javascript is embedded in those.

(If you want to allow only a safe subset, see the FILTER_CDATA rule of <http://jacaranda.org/jacaranda-spec-0.46.txt>. Note: Jacaranda is a dead project; I am no longer confident that the general approach it used is sound, and my current focus is on new languages built from scratch for security.)

-- 
Daira Hopwood ⚥
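
A minimal sketch of the difference described above, assuming an ES6-capable engine such as Node.js. The source strings and variable names are made up for illustration; each snippet is compiled with the standard Function constructor so that the \u sequences reach the lexer as literal source text.

```javascript
// Each source string contains a literal backslash-u sequence (written \\u
// here so that it survives into the text handed to the Function constructor).

// 1. Inside a line comment the escape is plain comment text. In Java,
//    \u000a would become a newline before lexing, ending the comment and
//    exposing the assignment; in ECMAScript the comment runs to the real
//    end of line.
const commentSrc = "var hit = false; // \\u000a hit = true;\nreturn hit;";
const commentResult = new Function(commentSrc)();

// 2. Inside an identifier the escape is interpreted: \u0061 spells `a`.
const identSrc = "var \\u0061 = 42; return a;";
const identResult = new Function(identSrc)();

// 3. Inside a string literal the escape denotes the character U+0041.
const stringSrc = 'return "\\u0041";';
const stringResult = new Function(stringSrc)();

// 4. Anywhere else the escape is not converted. Here \u003b is NOT read
//    as a semicolon, so the source fails to parse.
let punctuatorError = null;
try {
  new Function("var x = 1 \\u003b return x;");
} catch (e) {
  punctuatorError = e;
}

console.log(commentResult, identResult, stringResult, punctuatorError.name);
// prints: false 42 A SyntaxError
```

The first case is the one relevant to the Stack Overflow question: the assignment hidden behind the escape never runs, because the escape is not turned into a line terminator.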
_______________________________________________ langsec-discuss mailing list langsec-discuss@mail.langsec.org https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss