On Thu, Oct 24, 2013 at 7:38 AM, Anne van Kesteren <[email protected]> wrote:
> On Thu, Oct 24, 2013 at 3:31 PM, Mathias Bynens <[email protected]> wrote:
>> Imagine you’re writing a JavaScript library that escapes a given string as
>> an HTML character reference, or as a CSS identifier, or anything else. In
>> those cases, you don’t care about grapheme clusters, you care about code
>> points, cause those are the units you end up escaping individually.
>
> Is that really a common operation? I would expect formatting,
> searching, etc. to dominate. E.g. whenever you do substr/substring you
> would want that to be grapheme-cluster aware.

I think I disagree. Trying to take this apart:

If you're searching, you don't want to use the iterator anyway, because finding character boundaries or grapheme boundaries is a waste of time. UTF-16 is designed so that you can search based on code units alone, without computing boundaries. RegExp searches fall in this category.

IIUC, "formatting" mostly involves finding patterns to replace; it's a special case of searching, right?

When you do substr/slice/substring, you should be using offsets that are on grapheme boundaries, but obtaining offsets by using String iteration and adding up the lengths will be very rare, I think.

So String iteration is kind of left looking around for a use case. I can't think of any that compel me to prefer graphemes over characters out of sheer practicality. Reversing strings, for example, I can't care about that. Anyone?

-j

_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
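
For anyone who wants to poke at the two cases discussed in this thread, here is a rough sketch, not a definitive implementation. It assumes an engine that implements the ES6 string iterator (`for...of` over a string), `String.prototype.codePointAt`, and `\u{...}` escapes. The helper name `escapeNonAscii` and the sample strings are made up for illustration. The first part is the escaping case, which walks the string by code point; the second part is the searching case, which uses plain code-unit offsets and never computes a boundary.

```javascript
// Hypothetical helper: replace each non-ASCII code point with a
// hexadecimal HTML character reference.
function escapeNonAscii(str) {
  let out = '';
  // for...of iterates by code point, so a surrogate pair arrives as one item.
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    out += cp > 0x7f ? '&#x' + cp.toString(16).toUpperCase() + ';' : ch;
  }
  return out;
}

// U+1D306 lies outside the BMP, so it occupies two UTF-16 code units.
const text = 'a\u{1D306}b\u{1D306}c';
const needle = '\u{1D306}b';

// Searching compares code units only. In a well-formed string, a well-formed
// needle cannot match starting in the middle of a surrogate pair, because
// lead and trail surrogates occupy disjoint ranges.
const at = text.indexOf(needle);

// The offset from the search can be passed straight to slice().
const tail = text.slice(at + needle.length);

const escaped = escapeNonAscii(text);
```

Here `at` and `needle.length` are both counted in code units, and `tail` still begins on a code point boundary. Neither part of the sketch says anything about grapheme clusters; that would need a separate segmentation step.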

