Eric Corry wrote:
Disagree with adding /u for this purpose and disagree with breaking backward
compatibility to let `/./.exec(s)[0].length == 2`.
Care to enlighten us with any thinking behind this disagreeing?
Sorry for the rushed and overly ebullient message. I disagreed with /u for
switching from code unit to code point mode because in the moment I didn't
think a code point mode necessary or particularly beneficial. Upon further
reflection, I rushed into this opinion and will be more closely examining
the related issues.
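
For concreteness, here is a minimal sketch of the difference between code unit mode and code point mode for the dot. It assumes an engine where the u flag switches to code point mode, as later standardized in ECMAScript 2015; the sample character and the variable names are only illustrative.

```javascript
// Sketch only: assumes the "u" flag means code point mode (as in ES2015).
// U+1D306 is a single code point stored as two UTF-16 code units.
const astral = "\u{1D306}";

// Code unit mode: the dot consumes one code unit (the high surrogate).
const unitMatch = /./.exec(astral)[0];

// Code point mode: the dot consumes the whole surrogate pair.
const pointMatch = /./u.exec(astral)[0];

console.log(unitMatch.length);  // 1
console.log(pointMatch.length); // 2
```

Without the flag the existing result is unchanged, which is the backward-compatibility point raised above.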
I further objected because I think the /u flag would be better used as an
ASCII/Unicode mode switcher for \d\w\b. My proposal for this is based on
Python's re.UNICODE or (?u) flag, which does the same thing except that it
also covers \s (which is already Unicode-based in ES). Therefore, I think
that if a flag is added that only switches from code unit to code point
mode, it should not be "u". Presumably, flag /u could simultaneously affect
\d\w\b and switch to code point mode. I haven't yet thought enough about
combining these two proposals to hold a strong opinion on the matter.
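
A short sketch of the current behavior that motivates this: \d and \w are ASCII-only, while \s is already Unicode-based. The last line approximates what a Unicode-aware \d might accept by using the \p{Nd} property escape, which is an assumption here (it requires an engine with ES2018 property escapes and is not part of either proposal). The sample characters are illustrative.

```javascript
// Current ES behavior: \d and \w only match ASCII characters.
const asciiDigit = /^\d$/.test("7");            // true
const arabicIndicDigit = /^\d$/.test("\u0663"); // false (U+0663 ARABIC-INDIC DIGIT THREE)
const accentedLetter = /^\w$/.test("\u00E9");   // false (U+00E9, e with acute)

// \s is already Unicode-based: it matches U+00A0 NO-BREAK SPACE.
const noBreakSpace = /^\s$/.test("\u00A0");     // true

// Assumption: a Unicode-mode \d would behave roughly like \p{Nd}.
const unicodeDigit = /^\p{Nd}$/u.test("\u0663"); // true

console.log(asciiDigit, arabicIndicDigit, accentedLetter, noBreakSpace, unicodeDigit);
```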
there are two ways to match any Unicode
grapheme that match existing regex library precedent:
From Perl and PCRE:
\X
This doesn't work inside []. Were you envisioning the same restriction in
JS?
Also it matches a grapheme cluster, which may be useful but is
completely different to what the dot does.
You are of course correct. And yes, I was envisioning the same restriction
within character classes. But I'm not a strong proponent of \X, especially
if support for Unicode categories is added.
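
To illustrate the distinction between a code point and a grapheme cluster, here is a small sketch. It assumes an engine with the ES2015 u flag; the sample string is illustrative. Even in code point mode the dot matches only the base letter of a combining sequence, whereas \X in Perl and PCRE would match the whole cluster.

```javascript
// "e" followed by U+0301 COMBINING ACUTE ACCENT: one grapheme cluster,
// but two code points.
const cluster = "e\u0301";

// Array.from iterates a string by code point.
const codePointCount = Array.from(cluster).length;

// The dot in code point mode matches just the first code point, "e".
const dotMatch = /./u.exec(cluster)[0];

console.log(codePointCount); // 2
console.log(dotMatch === "e"); // true
```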
I agree with Steven that these two cases should just be left alone,
which means they will continue to work the way they have until now.
Glad to hear it.
You seem to be confusing graphemes and unicode code points.
[...]
The proposal you are responding to is all about adding Unicode code
point handling to regexps. It is not about adding grapheme support,
which is a rather different issue.
Indeed. My response was rushed and poorly formed. My apologies.
--Steven Levithan