Questions regarding ES6 Unicode regular expressions

Mathias Bynens Mon, 25 Aug 2014 02:00:07 -0700

Norbert’s original proposal for the `u` flag
(http://norbertlindenberg.com/2012/05/ecmascript-supplementary-characters/#RegExp)
mentioned the following:


> Possibly the definition of the character classes `\d\D\w\W\b\B` is extended 
> to their Unicode extensions, such as all characters in the Unicode category 
> “Number, decimal” for `\d`, as proposed by Steven Levithan. Whether this can 
> be done under the same flag or requires a different one still needs 
> discussion.

Has this been discussed any further? (I couldn’t find any mention of it in the 
meeting notes repository.) Should I file a bug?

Norbert also suggested replacing ‘characters’ with ‘code points’ in sections
like
https://people.mozilla.org/~jorendorff/es6-draft.html#sec-characterclassescape
and
https://people.mozilla.org/~jorendorff/es6-draft.html#sec-runtime-semantics-charactersetmatcher-abstract-operation
when the `u` flag is set. It seems the intent was to make e.g. `/\d/u`
equivalent to `/[0-9]/u`, and to make `/\D/u` match every Unicode code point
except `[0-9]`. This is different from `/\D/`, which only matches BMP code
points (a supplementary code point is seen as two separate surrogate code
units).

It seems like this change has not propagated to the spec draft, though. Is this 
correct, and if so, what’s the reason for that?
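
To make the difference between `/\D/` and `/\D/u` concrete, here is a minimal sketch. It assumes an engine that implements the `u` flag as it was eventually standardized in ES2015, which is not necessarily what the draft said at the time of writing. `firstMatchLength` and `supplementary` are hypothetical names introduced only for this illustration.

```javascript
// Hypothetical helper: length (in UTF-16 code units) of the first match,
// or -1 if the regular expression does not match at all.
function firstMatchLength(regex, input) {
  const match = regex.exec(input);
  return match === null ? -1 : match[0].length;
}

// U+1F4A9 is a supplementary code point: one code point, two code units.
const supplementary = "\u{1F4A9}";

// Without `u`, \D consumes a single code unit (the high surrogate).
console.log(firstMatchLength(/\D/, supplementary));  // 1

// With `u`, \D consumes the whole code point.
console.log(firstMatchLength(/\D/u, supplementary)); // 2

// \d stays equivalent to [0-9] under `u`:
// U+0661 ARABIC-INDIC DIGIT ONE is not matched.
console.log(/\d/u.test("\u0661")); // false
console.log(/\d/u.test("7"));      // true
```

Note that this shows `\d` remaining ASCII-only under `u`. The Unicode extension of `\d` from the quoted proposal is a separate question.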

The same goes for `/[^a]/u` – should this match all Unicode code points except 
`a` or should it only match BMP code points?
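
The same question for negated character classes can be sketched like this, again assuming an engine with the `u` flag as standardized in ES2015. `matchedPieces` and `sample` are hypothetical names used only for this example.

```javascript
// Hypothetical helper: all substrings matched by a global regular expression.
function matchedPieces(regex, input) {
  return input.match(regex) || [];
}

// "a", then U+1F4A9 (two UTF-16 code units), then "b".
const sample = "a\u{1F4A9}b";

// Without `u`, each surrogate half is matched on its own.
console.log(matchedPieces(/[^a]/g, sample).length);  // 3

// With `u`, U+1F4A9 is matched as a single code point.
console.log(matchedPieces(/[^a]/gu, sample).length); // 2
```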

_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
