The idea for /u and the following aspects of it already seem to have some consensus:

- Switch from code unit to code point matching.
- Make \d, \w, and \b Unicode-aware.
- Make /i use proper Unicode casefolding.
- Enable \u{x..} (break from web reality).
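
For illustration, here is a minimal sketch of the first and last items above. It assumes an engine that already implements a u flag with these two behaviors (as ES2015 engines later did). The variable names are mine, and the \d\w\b and /i items are not shown because their exact semantics are still open in this thread.

```javascript
// Sketch only: assumes the engine supports the u flag.
// U+1F4A9 is one code point, stored as two UTF-16 code units.
const astral = "\uD83D\uDCA9";

// Without u, "." consumes one code unit, so the string looks two characters long.
const dotLegacy = /^.$/.test(astral);
// With u, "." consumes one whole code point.
const dotUnicode = /^.$/u.test(astral);

// Without u, \u{1F4A9} is not a code point escape: \u degrades to a literal "u".
const braceLegacy = new RegExp("^\\u{1F4A9}$").test(astral);
// With u, \u{1F4A9} denotes the code point U+1F4A9.
const braceUnicode = new RegExp("^\\u{1F4A9}$", "u").test(astral);

console.log({ dotLegacy, dotUnicode, braceLegacy, braceUnicode });
```

The \u{..} patterns are built with the RegExp constructor so that both variants come from the same source string.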

Since /u may be a one-time opportunity to broadly change RegExp semantics, how about adding another change to the pile?

- Break from web reality for escaped A-Z and a-z. Throw a SyntaxError when any letter not assigned a special meaning is escaped, instead of matching the literal character.

I.e., /\i/u etc. must throw a SyntaxError.

This is relevant to future Unicode support, because without breaking web reality we might never be able to add \p{..} and \P{..} for Unicode properties, \X for graphemes, \N{..} for named characters, etc.
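
To make the problem concrete, here is a rough sketch of what web reality does with these escapes today, next to the proposed strict behavior. The helper name `compiles` is hypothetical, and the last line assumes an engine whose u flag rejects unassigned letter escapes, which is exactly what is being proposed here (ES2015 engines later did this).

```javascript
// Web reality: an escaped letter with no special meaning matches itself.
const legacyI = /\i/.test("i");

// So \p{L} without u already means the literal text "p{L}", and existing
// pages may depend on that. Giving it a new meaning would break them.
const legacyP = new RegExp("\\p{L}").test("p{L}");

// Hypothetical helper: reports whether a pattern compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected
  }
}

// Proposed: under u, an unassigned letter escape is a SyntaxError.
const strictI = compiles("\\i", "u");

console.log({ legacyI, legacyP, strictI });
```

Because every unassigned letter escape is an error under /u, any of them can be given a meaning later without changing the behavior of a pattern that already compiles.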

Of course, this change would also make it easier to add any of the host of special escapes found in other regex libraries (such as \k<..> for named backreferences), or new ES inventions. It's really ugly that such features might not be addable by default everywhere, but them's the breaks, I suppose (I hope I'm wrong).

We could go crazy and start fixing all of ES's RegExp warts when /u is applied, even though such changes would not be related to Unicode support. I'd be happy to pursue that, but I suspect many here would see it as a bridge too far.

Thoughts?

-- Steven Levithan

_______________________________________________
es-discuss mailing list
https://mail.mozilla.org/listinfo/es-discuss
