On 25/06/15 06:00, travis+ml-lang...@subspacefield.org wrote:
> https://stackoverflow.com/questions/30727515/why-is-executing-java-code-in-comments-with-certain-unicode-characters-allowed?stw=2

Javascript/ECMAScript has different rules for Unicode escapes than Java. It
doesn't convert \u escapes before lexing; it only interprets them in
identifiers and strings.

<http://www.ecma-international.org/ecma-262/6.0/index.html#sec-comments>
<http://www.ecma-international.org/ecma-262/6.0/index.html#sec-names-and-keywords>
<http://www.ecma-international.org/ecma-262/6.0/index.html#sec-literals-string-literals>

(This was the same in previous versions, and also in vendor implementations of
Javascript, although there were differences in sets of allowed characters and
escapes.)
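
A minimal sketch of the difference, written in TypeScript (which follows the
ECMAScript lexical rules here). The function names are hypothetical and chosen
only for illustration. In Java, the \u000a in the comment below would be
translated to a line terminator before lexing, ending the comment and exposing
the assignment as code; under the ECMAScript rules it stays part of the comment.

```typescript
// Sketch only: all names here are hypothetical, chosen for illustration.

// 1. A \u escape inside a comment is NOT translated, so the comment keeps
//    going to the real end of the line and the assignment never runs.
function commentStaysInert(): string {
  let result = "comment was inert";
  // \u000a result = "code in comment executed";
  return result;
}

// 2. Inside an identifier the escape IS interpreted: \u0061bc names `abc`.
function identifierEscape(): number {
  const \u0061bc = 42;
  return abc;
}

// 3. Inside a string literal the escape IS interpreted: \u0041 is "A".
function stringEscape(): boolean {
  const decoded: string = "\u0041";
  return decoded === "A";
}
```

Calling commentStaysInert() should return the original string, which is the
behaviour that differs from the Java case in the linked question.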

Note that MIME charset decoding *is* done before interpreting Javascript. Also,
HTML or XML entity expansion is potentially tricky if the Javascript is
embedded in those.
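
To illustrate why entity expansion is tricky, here is a rough sketch, assuming
the script text comes from something like an HTML event-handler attribute,
where character references are decoded before the script is interpreted. The
helper decodeNumericEntities is hypothetical and handles only decimal
references such as &#10;; a real HTML or XML parser handles many more forms.

```typescript
// Hypothetical helper: decodes only decimal numeric character references.
// This stands in for the entity expansion a real HTML/XML parser performs.
function decodeNumericEntities(text: string): string {
  return text.replace(/&#(\d+);/g, (_match: string, digits: string) =>
    String.fromCharCode(Number(digits)));
}

// As written in the markup, this looks like a single line comment.
const attributeValue = "// harmless comment &#10; return 'executed';";

// Interpreted without expansion, the whole body is a comment.
const rawOutcome = new Function(attributeValue)();

// After expansion, &#10; has become a line feed, which ends the comment,
// so the text after it is parsed as code.
const decodedValue = decodeNumericEntities(attributeValue);
const decodedOutcome = new Function(decodedValue)();
```

Here rawOutcome is undefined while decodedOutcome is 'executed', so a filter
that inspects the text before expansion sees something different from what the
Javascript interpreter eventually receives.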

(If you want to allow only a safe subset, see the FILTER_CDATA rule of
<http://jacaranda.org/jacaranda-spec-0.46.txt>. Note: Jacaranda is a dead
project; I am no longer confident that the general approach it used is sound,
and my current focus is on new languages built from scratch for security.)

-- 
Daira Hopwood ⚥
