Erik Corry wrote:
[...snip] what does /[^\0-\uFFFF\u{10000}]/ without /u match?
Without the /u flag it should behave exactly as it has done until now,
for reasons of backwards compatibility.  On V8 that means that
/[\u{10000}]/
is the same as
/[u01{}]/

That sounds good. Not only does it avoid breaking with web reality, it also means that regexes without /u never need to use a weird mix of code unit and code point matching semantics. To extend the backward-compatible approach you prescribe here, the following should all be true when /u is not used:

* /\u{10}/ eq /u{10}/ (literal u repeated 10 times).
* Shorthand classes like \D, \S, and the dot match BMP code units only.
* [^\0-\uFFFF] eq [] eq (?!) eq \b\B. (All of these are used in real-world regexes.)
* If ES6 or later adds \p{..} for Unicode property matching, it's limited to matching BMP code units.
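
For concreteness, here is a minimal sketch of the first and third bullets, plus the V8 character-class case quoted above. It assumes an engine that keeps the legacy web-compatible parsing when /u is absent; the variable names (tenUs, cls, never) are only illustrative. It does not cover the \p{..} bullet, which is hypothetical at this point.

```javascript
// Sketch only: behavior WITHOUT the /u flag, assuming legacy
// web-compatible parsing of "\u" that is not followed by 4 hex digits.

// \u{10} parses as the literal "u" followed by the quantifier {10}.
const tenUs = new RegExp("^\\u{10}$");
console.log(tenUs.test("u".repeat(10))); // true
console.log(tenUs.test("\u0010"));       // false

// [\u{10000}] parses as the character class [u01{}].
const cls = new RegExp("^[\\u{10000}]+$");
console.log(cls.test("u{10000}"));     // true
console.log(cls.test("\uD800\uDC00")); // false

// [^\0-\uFFFF] excludes every 16-bit code unit, so it can never match,
// just like [], (?!) and \b\B.
const never = /[^\0-\uFFFF]/;
console.log(never.test("a\uD800\uDC00\uFFFF")); // false
console.log(/[]/.test("abc"), /(?!)/.test("abc"), /\b\B/.test("abc")); // false false false
```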

In other words, without /u, all matching is restricted to BMP code units. With /u, all matching is code point based and works with full 21-bit Unicode.
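
A rough sketch of that split, assuming an engine that implements /u with code point semantics as proposed in this thread (the sample string and the name astral are my own choices for illustration):

```javascript
// Sketch only: the same patterns with and without /u, applied to one
// supplementary code point (U+1D306) written as its surrogate pair.
const astral = "\uD834\uDF06";

// Without /u the dot consumes a single 16-bit code unit.
console.log(/^.$/.test(astral));  // false
console.log(/^..$/.test(astral)); // true

// With /u the dot consumes the whole code point.
console.log(/^.$/u.test(astral)); // true

// \u{...} is a code point escape only when /u is present.
console.log(/^\u{1D306}$/u.test(astral)); // true

// [^\0-\uFFFF] is empty without /u, but with /u it matches any
// code point above U+FFFF.
console.log(/[^\0-\uFFFF]/.test(astral));  // false
console.log(/[^\0-\uFFFF]/u.test(astral)); // true
```

The point of the contrast is that neither mode mixes semantics: a pattern is interpreted either entirely in code units or entirely in code points.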

This also provides another argument in favor of automatically implying /u in ES modules. It would be somewhat obnoxious to not let \u{..} work by default in modules.

-- Steven Levithan

_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
