On 26 August 2014 at 20:15, Mathias Bynens <[email protected]> wrote:
> On 26 Aug 2014, at 19:01, Allen Wirfs-Brock <[email protected]> wrote:
>
>> I've thought about this a bit. I was initially inclined to agree with the
>> idea of extending the existing character classes similar to what Mathias'
>> proposes. But I now think that is probably not a very good idea and that
>> what is currently spec'ed (essentially that the /u flag doesn't change the
>> meaning of \w, \d, etc.) is the better path. […] It seems to me, that we
>> want programmers to start migrating to full Unicode regular expressions
>> without having to do major logic rewrite of their code. For example,
>> ideally the above expression could simply be replaced by
>> `parseInt(/\s*(\d+)/u.exec(input)[1])` and everything in the application
>> could continue to work unchanged.
>
> I see your point, but I disagree with the notion that we must absolutely
> maintain backwards compatibility in this case. The fact that the new flag is
> opt-in gives us an opportunity to improve behavior without obsessing about
> back-compat, similar to how the strict mode opt-in is used to make all sorts
> of things better. When [evangelizing
> `/u`](https://mathiasbynens.be/notes/es6-unicode-regex), we can educate
> developers and tell them to not blindly/needlessly add `/u` to their existing
> regular expressions.
>
>> Instead, we should leave the definitions of \d, \w and \s unchanged and plan
>> to adopt the already established convention that `\p{<Unicode property>}` is
>> the notation for matching Unicode categories. See
>> http://www.regular-expressions.info/unicode.html
>
> We could do both: improve `\d` and `\w` now, and add `\p{property}` and
> `\P{property}` later. Anyhow, I’ve filed
> https://bugs.ecmascript.org/show_bug.cgi?id=3157 for reserving
> `\p{…}`/`\P{…}`.

The meaning of `\d` should not be changed; it is routinely used as a synonym of `[0-9]`. Changing its meaning is willfully introducing traps in the language, and it *will* produce bugs, for very little gain.
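
To make the trap concrete, here is a minimal sketch built around the quoted `parseInt` example. It assumes the behavior described above as currently spec'ed, where `/u` leaves `\d` equal to `[0-9]`. The helper name `leadingNumber` and the sample inputs are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper modelled on the quoted parseInt example.
function leadingNumber(input) {
  const match = /\s*(\d+)/u.exec(input);
  return match === null ? null : parseInt(match[1], 10);
}

// ASCII digits: the capture and parseInt agree.
const asciiResult = leadingNumber("  42 apples");

// Arabic-Indic digits (U+0664 U+0662, category Nd): with `\d` meaning
// [0-9], nothing is captured and the caller sees an explicit null.
const arabicIndicResult = leadingNumber("\u0664\u0662");

// parseInt itself only understands ASCII digits, so a widened `\d` would
// capture text that parseInt then cannot convert.
const rawParse = parseInt("\u0664\u0662", 10);

console.log(asciiResult, arabicIndicResult, rawParse); // 42 null NaN
```

If `\d` were widened to every decimal digit, the capture in `leadingNumber` would succeed on the second input and the function would return `NaN` instead of `null`, which is the kind of silent bug the paragraph above warns about.
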
It is much safer to learn to use `\pN` in the rare situations where one wants to match numerical characters in any script.

For `\w` and `\b`, on the other hand, the meaning can be corrected: nobody would normally consider that there are two word boundaries in the middle of "fiancée", and that is not useful semantics, especially in Unicode-aware contexts (that is, in situations where you should use the `u` flag).

-- Claude

_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
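
As an illustration of the "fiancée" point, the sketch below shows where an ASCII-only `\b` splits the word, next to one possible Unicode-aware alternative. It assumes a runtime that supports Unicode property escapes in the braced form `\p{…}`, which ECMAScript standardized in the 2018 edition, after this thread; the single-letter `\pN` spelling used in the message is the shorthand found in other regex flavors. The pattern `[\p{L}\p{N}_]+` is a hypothetical stand-in for a Unicode-aware `\w`, not a definition taken from any specification.

```javascript
const word = "fianc\u00e9e"; // "fiancée" with a precomposed é (U+00E9)

// ASCII-only \w treats "é" as a non-word character, so \b finds a
// boundary on each side of it and split() cuts the word in three.
const asciiPieces = word.split(/\b/u);

// Hypothetical Unicode-aware word pattern: letters, numbers, underscore.
const unicodeWords = word.match(/[\p{L}\p{N}_]+/gu);

// Braced property escape for "any numeric character", tested here on
// Arabic-Indic digits (U+0664 U+0662).
const matchesAnyScriptDigits = /^\p{N}+$/u.test("\u0664\u0662");

console.log(asciiPieces);            // [ 'fianc', 'é', 'e' ]
console.log(unicodeWords);           // [ 'fiancée' ]
console.log(matchesAnyScriptDigits); // true
```
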

