On Thu, Apr 2, 2015 at 1:39 AM, Andrea Giammarchi wrote:

> Jordan the purpose of `Array.from` is to iterate over the string, and the
> point of iteration instead of splitting is to have automagically codepoints.
> This, unless I've misunderstood Mathias presentation (might be)
>
> So, here there is a different problem: there are code-points that do not
> represent real visual representation ...

Those are called grapheme clusters or just “graphemes”, as Boris mentioned. And here’s how to deal with them: https://mathiasbynens.be/notes/javascript-unicode#other-grapheme-clusters

“Unicode Standard Annex #29 describes [an algorithm for determining grapheme cluster boundaries](http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries). For a _completely_ accurate solution that works for all Unicode scripts, implement this algorithm in JavaScript, and then count each grapheme cluster as a single symbol.”

> or maybe, the real problem, is about broken `Array.from` polyfill?

`Array.from` just uses `String.prototype[Symbol.iterator]` internally, and that is defined to deal with code points, not grapheme clusters. Either choice would have confused some developers. IIRC, Perl 6 has built-in capabilities to deal with grapheme clusters, but until ES does, this use case must be addressed in user-land.
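
To make the code point vs. grapheme cluster distinction concrete, here is a minimal sketch (not from the original thread). `toCodePoints` and `roughGraphemes` are hypothetical helper names. The latter only attaches combining marks (Unicode general category M) to the preceding code point, so it is a rough approximation and not an implementation of the UAX #29 algorithm. It also assumes an engine that supports `\p{…}` property escapes in `u`-flag regular expressions.

```javascript
// Hypothetical helper: split a string into code points. Array.from
// relies on the string iterator, which walks code points rather than
// UTF-16 code units.
function toCodePoints(str) {
  return Array.from(str);
}

// Hypothetical helper, NOT a UAX #29 implementation: it only glues
// combining marks (general category M) onto the preceding code point.
// Many other cluster types (Hangul jamo, emoji sequences, ...) are ignored.
function roughGraphemes(str) {
  const clusters = [];
  for (const symbol of str) {
    if (clusters.length > 0 && /^\p{M}$/u.test(symbol)) {
      // Combining mark: append it to the cluster it modifies.
      clusters[clusters.length - 1] += symbol;
    } else {
      clusters.push(symbol);
    }
  }
  return clusters;
}

const astral = '\u{1F4A9}'; // one code point, two UTF-16 code units
const combined = 'n\u0303'; // 'n' + COMBINING TILDE: two code points

console.log(astral.length);                   // 2 (code units)
console.log(toCodePoints(astral).length);     // 1 (code points)
console.log(toCodePoints(combined).length);   // 2 (code points)
console.log(roughGraphemes(combined).length); // 1 (approximate clusters)
```

The last two lines show the behaviour discussed above: iteration by code point is correct for astral symbols but still reports two entries for a single visible symbol built from a base character plus a combining mark.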

