The idea for /u and the following aspects of it already seem to have some
consensus:
- Switch from code unit to code point matching.
- Make \d\w\b Unicode-aware.
- Make /i use proper Unicode casefolding.
- Enable \u{x..} (break from web reality).
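
To make the first and last items concrete, here is a rough sketch of the difference, assuming an engine that already accepts the /u flag and the \u{x..} syntax in both string literals and patterns. The variable names are only for illustration.

```javascript
// U+1D306 lies outside the BMP, so it is stored as a surrogate pair
// (two UTF-16 code units).
const astral = "\u{1D306}";

// Without /u, "." consumes a single code unit, so one "." cannot span the pair.
const dotWithoutU = /^.$/.test(astral);            // false

// With /u, "." consumes a whole code point.
const dotWithU = /^.$/u.test(astral);              // true

// With /u, \u{...} is a code point escape.
const braceEscapeWithU = /^\u{1D306}$/u.test(astral);  // true

// Without /u, web reality reads \u{3} as a literal "u" repeated three times.
const braceEscapeWithoutU = /^\u{3}$/.test("uuu");     // true
```

The last line is the "break from web reality": the same source text means something different once /u is applied.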
Since /u may be a one-time opportunity to broadly change RegExp semantics,
how about adding another change to the pile?
- Break from web reality for escaped A-Z and a-z. Throw a SyntaxError when
any letter not assigned a special meaning is escaped, instead of matching
the literal character.
I.e., /\i/u etc. must throw a SyntaxError.
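
A minimal sketch of the proposed behavior, assuming an engine that implements the restriction under /u. The `compiles` helper is hypothetical and exists only for this example. It uses the RegExp constructor so that the SyntaxError surfaces at run time and can be caught.

```javascript
// Hypothetical helper: reports whether a pattern compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

// Web reality: an escaped letter with no special meaning matches itself.
const legacyAccepts = compiles("\\i", "");                 // true
const legacyMatchesLiteral = new RegExp("\\i").test("i");  // true

// Proposed rule: the same escape is rejected once /u is applied.
const unicodeAccepts = compiles("\\i", "u");               // false under the proposal

// Letters that already have a special meaning are unaffected.
const unicodeAcceptsDigitClass = compiles("\\d", "u");     // true
```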
This is relevant to future Unicode support, because without breaking web
reality we might never be able to add \p{..} and \P{..} for Unicode
properties, \X for graphemes, \N{..} for named characters, etc.
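
To illustrate the problem, this is roughly how patterns that look like the future syntax behave today without /u, assuming the usual web-compatible treatment of unknown escapes and stray braces. Any existing code that relies on this would change meaning if the escapes were given new semantics by default.

```javascript
// Without /u, \p is just a literal "p" and the braces are literal text.
const propertyLookalike = /^\p{L}$/;
const matchesLiteralText = propertyLookalike.test("p{L}");  // true
const matchesLetter = propertyLookalike.test("a");          // false

// Likewise, \X is just a literal "X".
const graphemeLookalike = /^\X$/.test("X");                 // true
```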
Of course, this change would also make it easier to add any of a host of
special escapes from other regex libraries (such as \k<..> for named
backreferences) or new ES inventions. It's really ugly that such features
might never be available by default everywhere, only under /u, but them's
the breaks, I suppose (I hope I'm wrong).
We could go crazy and start fixing all of ES's RegExp warts when /u is
applied, even though such changes would not be related to Unicode support.
I'd be happy to pursue that, but I suspect many here would see it as a
bridge too far.
Thoughts?
-- Steven Levithan