Re: proposed relationships of Secure EcmaScript, ES3.1, and ES4.

Brendan Eich Thu, 21 Feb 2008 00:24:04 -0800

On Feb 20, 2008, at 6:10 PM, Mike Samuel wrote:

>     JSON ⊂ ADsafe ⊂ Cajita ⊂ Caja ⊂ ES3 ⊂ ES4


People who know Unicode are dangerous ;).


Yes, we need more of you ;-).

There are three problems according to my reading of http://www.ietf.org/rfc/rfc4627.txt but only the first is directly related to syntax:
(1) There are JSON programs that are not valid ES programs.
The JSON program [ "\u2028" ] where the Unicode escape is replaced with its literal equivalent is valid according to JSON since the set of characters that can appear in a string unescaped is
    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
but ES does not allow codepoint 0x2028 or 0x2029 to appear unescaped in a string since they are newline characters.

I wonder if JSON should not change on this point. Is there a use case for unescaped line/paragraph separators in strings?
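
One workaround for mismatch (1) can be sketched as follows, assuming an engine with a built-in JSON object (JSON.stringify and JSON.parse were standardized after this thread). The helper name escapeLineSeparators is hypothetical. In valid JSON text the two separators can only occur inside strings, so a global replace is enough.

```javascript
// Hypothetical helper: make JSON text safe to embed in ES source by
// escaping U+2028 and U+2029, which JSON permits raw inside strings.
function escapeLineSeparators(jsonText) {
  return jsonText
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

// JSON.stringify leaves both separators unescaped in its output.
var raw = JSON.stringify(["\u2028", "\u2029"]);
// After escaping, the text is still JSON and parses to the same strings.
var safe = escapeLineSeparators(raw);
```

The escaped text stays valid JSON, so a consumer that never evaluates it as ES is unaffected.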

(2) There are JSON programs that have the same text as ES programs but different meaning.
ES262 says that all format control codepoints, such as 0x200C, should be stripped out of the program in a pre-lex phase. This is not consistently implemented:
eval("'\u200c'.length") == 0 on SpiderMonkey, and 1 on most other interpreters

Not lately, meaning post-Firefox-2/JS1.7. Fresh js shell, same results for Firefox 3 any beta:


js> eval("'\u200c'.length") == 0
false
js> eval("'\u200c'.length")
1

See https://bugzilla.mozilla.org/show_bug.cgi?id=274152, where SpiderMonkey yields to IE JScript's flouting of ECMA-262. IE set a real-world web standard, and for the better according to people in certain locales.

According to https://bugzilla.mozilla.org/show_bug.cgi?id=368516#c34, IE does not report illegal character errors correctly, instead treating misplaced BOMs as identifiers whose references result in runtime ReferenceErrors (I don't know what it does with other format-control characters that occur outside of strings and regexps).

See also the follow-on bug to tolerate mislocated BOMs, https://bugzilla.mozilla.org/show_bug.cgi?id=368516. Ain't the copy/paste Internet grand?

JSON does not strip these characters out, so they are treated as significant.

ES4 is specifying as a bug fix to match other browsers that format-control characters shall not be stripped; it must also, to be a real-world web standard, specify tolerance for mislocated BOMs. Postel's Law bites back!


So JSON and ES4 will agree on this one.
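
A quick check of that agreement, as a sketch only: it assumes an engine with a built-in JSON.parse (standardized after this thread) that also keeps format-control characters in program text rather than stripping them. The variable names are illustrative.

```javascript
// "\u200c" in each outer literal becomes a raw U+200C character, so both
// parsers below see the format-control character itself, not an escape.
var jsonLength = JSON.parse('"\u200c"').length;
var esLength = eval("'\u200c'.length");

// True when the JSON parser and the ES evaluator agree on the string length.
var agree = jsonLength === esLength;
```

On an engine that still strips format-control characters in a pre-lex phase, esLength would be 0 and agree would be false.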

(3) There are JSON programs that can be parsed to ES but that cannot be serialized back to JSON without losing track of where info was lost.
JSON does not put any limits on numbers, but ES does. ES will treat 1e1000 as Infinity. Since JSON does not have a value Infinity, it is unclear how to implement toJSON(fromJSON("[1e1000]")).

JSON's grammar is nice and simple; it facilitates exhaustive testing (Rob Sayre used Koushik Sen's jCUTE to generate all-paths tests for a Java implementation).

BigInts or BigNums could help in the future, but the installed base will not have them for a while and their literal syntax, without a pragma, will have a suffix.

This kind of edge case is unlikely to be a problem in practice, although such "overflow" conditions recur throughout the security exploit literature. Could JSON stand to grow support for the IEEE-754 non-finite values?
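
The overflow case in (3) can be sketched as below, assuming the built-in JSON.parse and JSON.stringify that were standardized after this thread stand in for the fromJSON and toJSON named above. The strictStringify helper is hypothetical; it shows one way a serializer could refuse to lose information silently.

```javascript
// ES numbers are IEEE-754 doubles, so the JSON number 1e1000 overflows
// to Infinity when parsed.
var parsed = JSON.parse("[1e1000]");

// JSON has no token for Infinity; JSON.stringify writes null in its place,
// so the original number cannot be recovered from the output.
var roundTripped = JSON.stringify(parsed);

// Hypothetical stricter serializer: throw instead of emitting null.
function strictStringify(value) {
  return JSON.stringify(value, function (key, v) {
    if (typeof v === "number" && !isFinite(v)) {
      throw new RangeError("non-finite number at key " + key);
    }
    return v;
  });
}
```

Throwing at least marks where the information was lost, which the silent null does not.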

/be

_______________________________________________
Es4-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es4-discuss
