On 24 March 2012 15:25, David Herman <[email protected]> wrote:

> > Presumably the JS source, as a sequence of UTF-16 code units, represents
> the tetragram code points as surrogate pairs.
>
> Clarification: the JS source *of the regexp literal*.
>
>
We certainly can, although this means that certain Unicode strings cannot
be matched by a regexp with this flag: namely, the ones containing
reserved code points.
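
To make the surrogate-pair point concrete, here is a minimal sketch in plain
ES5. It assumes U+1D306 (TETRAGRAM FOR CENTRE) as a sample tetragram, and it
assumes that "reserved code points" refers to the surrogate range
U+D800..U+DFFF. The helper names toCodePoint and hasLoneSurrogate are
hypothetical and not part of any proposal.

```javascript
// Assumed sample: U+1D306, written as its UTF-16 surrogate pair.
var tetragram = "\uD834\uDF06";

// Hypothetical helper: combine a high/low surrogate pair into one code point.
function toCodePoint(hi, lo) {
  return (hi - 0xD800) * 0x400 + (lo - 0xDC00) + 0x10000;
}

// Hypothetical helper: true if the string contains a surrogate code unit
// that is not part of a well-formed high+low pair.
function hasLoneSurrogate(s) {
  for (var i = 0; i < s.length; i++) {
    var u = s.charCodeAt(i);
    if (u >= 0xD800 && u <= 0xDBFF) {
      var next = s.charCodeAt(i + 1); // NaN at end of string
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++; // skip the low half of a well-formed pair
        continue;
      }
      return true; // high surrogate with no low surrogate after it
    }
    if (u >= 0xDC00 && u <= 0xDFFF) {
      return true; // low surrogate with no high surrogate before it
    }
  }
  return false;
}

// One code point, but two UTF-16 code units.
var codePoint = toCodePoint(tetragram.charCodeAt(0), tetragram.charCodeAt(1));
var unitCount = tetragram.length;
```

Under these assumptions, a string such as "\uD834" on its own is a valid
sequence of code units but has no corresponding code point sequence, which
is the kind of string that could not be matched.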

That said, why is the JS source suddenly a sequence of UTF-16 code units? I
believe JS source code should be a sequence of Unicode code points (and I
think ES5 says something to this effect).

The underlying transport format should not be a concern for the JS lexer.
The lexer should receive a series of code points from the network
transport, allowing web sites to transmit JS in whatever encoding they see
fit, provided the browser and server can both agree on it.  I think UTF-8
would make a fine transport format for JS source code.  IMHO the transport
format between the browser and the JS lexer [i.e. the input program
encoding] should be allowed to be implementation-defined and not specified
by TC39.
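
As a rough illustration of that separation, the sketch below decodes UTF-8
bytes into an array of code points before anything resembling a lexer sees
them. The function name decodeUtf8 is hypothetical, and the decoder is
deliberately incomplete: it rejects malformed lead and continuation bytes
but does not reject overlong forms, encoded surrogates, or values above
U+10FFFF.

```javascript
// Hypothetical helper: turn an array of UTF-8 bytes into an array of
// Unicode code points. A sketch only, not a conforming decoder.
function decodeUtf8(bytes) {
  var codePoints = [];
  var i = 0;
  while (i < bytes.length) {
    var b = bytes[i];
    var extra; // number of continuation bytes that follow the lead byte
    var cp;    // code point being accumulated
    if (b < 0x80) { extra = 0; cp = b; }
    else if ((b & 0xE0) === 0xC0) { extra = 1; cp = b & 0x1F; }
    else if ((b & 0xF0) === 0xE0) { extra = 2; cp = b & 0x0F; }
    else if ((b & 0xF8) === 0xF0) { extra = 3; cp = b & 0x07; }
    else { throw new Error("invalid lead byte at index " + i); }

    if (i + extra > bytes.length - 1) {
      throw new Error("truncated sequence at index " + i);
    }
    for (var k = 1; k <= extra; k++) {
      var c = bytes[i + k];
      if ((c & 0xC0) !== 0x80) {
        throw new Error("invalid continuation byte at index " + (i + k));
      }
      cp = cp * 64 + (c & 0x3F); // append six payload bits
    }
    codePoints.push(cp);
    i += extra + 1;
  }
  return codePoints;
}

// "a" followed by U+1D306 (an assumed sample), as sent over the wire in UTF-8.
var wireBytes = [0x61, 0xF0, 0x9D, 0x8C, 0x86];
var lexerInput = decodeUtf8(wireBytes);
```

Here the lexer would receive two code points, regardless of whether the
bytes arrived as UTF-8, UTF-16, or anything else the two ends agreed on.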

Wes

-- 
Wesley W. Garland
Director, Product Development
PageMail, Inc.
+1 613 542 2787 x 102
