On Oct 26, 2013, at 6:58 , Jason Orendorff <[email protected]> wrote:
> On Fri, Oct 25, 2013 at 11:42 PM, Norbert Lindenberg
> <[email protected]> wrote:
>>
>> On Oct 25, 2013, at 18:35 , Jason Orendorff <[email protected]>
>> wrote:
>>
>>> UTF-16 is designed so that you can search based on code units
>>> alone, without computing boundaries. RegExp searches fall in this
>>> category.
>>
>> Not if the RegExp is case insensitive, or uses a character class, or ".", or
>> a quantifier - these all require looking at code points rather than UTF-16
>> code units in order to support the full Unicode character set.
>
> I'd like to know what you have in mind regarding quantifiers though.

When I write /💩{2}/, I mean /💩💩/, but the current code unit based RegExp will interpret it as /💩\uDCA9/, which can't match any well-formed UTF-16 string.

Norbert
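
A minimal sketch of the quantifier case above, for anyone who wants to try it. It assumes an engine that implements the `u` (Unicode) flag, which was standardized in ES2015, after this message was written; the variable names are purely illustrative. 💩 is U+1F4A9, which UTF-16 encodes as the surrogate pair \uD83D \uDCA9.

```javascript
// U+1F4A9 written out as its two UTF-16 code units.
const pile = "\uD83D\uDCA9";
const twoPiles = pile + pile;

// Code-unit mode: {2} binds only to the trailing surrogate \uDCA9,
// so the pattern is effectively /\uD83D\uDCA9\uDCA9/.
const codeUnitRe = new RegExp(pile + "{2}");

// Code-point mode (assumes ES2015 `u` flag support): {2} binds to the
// whole code point U+1F4A9.
const codePointRe = new RegExp(pile + "{2}", "u");

// The well-formed string of two piles is not matched in code-unit mode...
const codeUnitMatchesTwo = codeUnitRe.test(twoPiles);
// ...but is matched in code-point mode.
const codePointMatchesTwo = codePointRe.test(twoPiles);
// Code-unit mode only matches an ill-formed string with a stray low surrogate.
const codeUnitMatchesIllFormed = codeUnitRe.test(pile + "\uDCA9");

console.log(codeUnitMatchesTwo, codePointMatchesTwo, codeUnitMatchesIllFormed);
```

Building the patterns with `new RegExp` from an escaped string keeps the sketch independent of the source file's encoding; a regex literal containing the character itself should behave the same way.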

