Good, that sounds right.
Mark <https://google.com/+MarkDavis>

*— Il meglio è l’inimico del bene —*

On Wed, Jan 28, 2015 at 12:57 PM, André Bargull <andre.barg...@udo.edu> wrote:

> On Wed, Jan 28, 2015 at 11:36 AM, Marja Hölttä <marja at chromium.org> wrote:
>
> > The ES6 unicode regexp spec is not very clear regarding what should happen
> > if the regexp or the matched string contains lonely surrogates (a lead
> > surrogate without a trail, or a trail without a lead). For example, for the
> > . operator, the relevant parts of the spec speak about characters:
>
> Just a bit of terminology.
>
> The term "character" is overloaded, so Unicode provides the unambiguous
> term "code point". For example, U+0378 is not (currently) an encoded
> character according to Unicode, but it would certainly be a terrible idea
> to disregard it, or not match it. It is a reserved code point that may be
> assigned as an encoded character in the future. So both U+D83D and U+0378
> are not characters.
>
> If an ES spec uses the term "character" instead of "code point", then at
> some point in the text it needs to disambiguate what is meant.
>
>
> "character" is defined in 21.2.2 Pattern Semantics [1]:
>
> In the context of describing the behaviour of a BMP pattern “character”
> means a single 16-bit Unicode BMP code point. In the context of describing
> the behaviour of a Unicode pattern “character” means a UTF-16 encoded code
> point.
>
> [1] https://people.mozilla.org/~jorendorff/es6-draft.html#sec-pattern-semantics
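For illustration, here is a minimal sketch of how the two readings of "character" quoted above play out for the . operator. The variable names are made up for this example, and the lone-surrogate results reflect how engines implementing the final ES2015 text behave as far as I know, not something the draft wording discussed in this thread settles on its own.

```javascript
// U+1F600 is stored in a JS string as the surrogate pair D83D DE00.
const astral = "\u{1F600}";
// A lead surrogate with no trail following it.
const loneLead = "\uD83D";

// BMP pattern (no u flag): "character" is a single 16-bit code unit.
const bmpDot = /^.$/;
// Unicode pattern (u flag): "character" is a UTF-16 encoded code point.
const unicodeDot = /^.$/u;

const results = {
  // Two code units, so a single . cannot span the whole string.
  astralBmp: bmpDot.test(astral),
  // One code point, so a single . matches the whole string.
  astralUnicode: unicodeDot.test(astral),
  // One code unit either way; with the u flag the lone surrogate
  // is treated as a code point with the surrogate's value.
  loneBmp: bmpDot.test(loneLead),
  loneUnicode: unicodeDot.test(loneLead),
  // A lone lead surrogate in the pattern matches half of a pair
  // only when the pattern works on code units.
  leadInPairBmp: /\uD83D/.test(astral),
  leadInPairUnicode: /\uD83D/u.test(astral),
};

console.log(results);
```

The last two entries are the case Marja's question is about: with the u flag the pair is read as one code point, so the lone lead in the pattern does not match inside it.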
_______________________________________________
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss