On Thu Aug 27 16:37:21 2015, raiph wrote:
> jnthn++ and others are busy with work that is far more important and
> urgent than dealing with this right now. I'm filing this bug now
> because there are reasons to consider addressing it before christmas
> as explained below.
>
> What I did
> ==========
>
> say "नि".chars
>
> What I expected
> ===============
>
> 1
>
> What I got
> ==========
>
> 2
>
> -----------------
>
> Some reasons why I think it's appropriate to classify नि as a single
> grapheme:
>
> 1. It's the last of 4 sample single graphemes in the "Extended
> Grapheme Clusters" section of the Unicode Standard Annex #29 on Text
> Segmentation:
> http://www.unicode.org/reports/tr29/tr29-27.html#Table_Sample_Grapheme_Clusters
>
> (The Unicode standard suggests aiming at Extended Grapheme Clusters at
> a minimum if an implementation wishes to deal with grapheme clusters.)
>
> 2. It's the first example in S15:
> https://raw.githubusercontent.com/perl6/specs/master/S15-unicode.pod
>
> 3. It behaves as a single unit for selection in my browser. (You too?)
>
> --------
>

Our NFG algorithm has now been aligned with the definition of graphemes provided in Unicode Standard Annex #29. The Unicode database provides a test suite, which has been incorporated into the spectests in S15-nfg/grapheme-break.t (over 400 tests, all passing).

Thanks,

/jnthn
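
For anyone wondering why the expected answer is 1: नि is two code points, U+0928 (a letter) followed by U+093F (a spacing vowel sign), and UAX #29 extended grapheme clusters do not break before such a mark. The following is only a rough sketch of that one rule in Python, not the NFG implementation and not Perl 6 code. The function name `approx_grapheme_count` is made up for illustration, and using the general categories Mn/Me/Mc as a stand-in for the Extend and SpacingMark properties is a simplifying assumption that ignores the rest of the UAX #29 rules (CR LF, Hangul, ZWJ sequences, regional indicators and so on).

```python
import unicodedata

# Assumption: approximate the UAX #29 "Extend" and "SpacingMark" properties
# with the general categories Mn, Me and Mc. The real properties differ in
# detail, so this is an illustration only.
_EXTENDING_CATEGORIES = {"Mn", "Me", "Mc"}


def approx_grapheme_count(text: str) -> int:
    """Count clusters, starting a new one at every non-extending code point."""
    count = 0
    for index, char in enumerate(text):
        starts_cluster = (
            index == 0
            or unicodedata.category(char) not in _EXTENDING_CATEGORIES
        )
        if starts_cluster:
            count += 1
    return count


if __name__ == "__main__":
    sample = "\u0928\u093F"  # the string from the ticket: NA + VOWEL SIGN I
    print(len(sample))                    # number of code points
    print(approx_grapheme_count(sample))  # approximate number of graphemes
```

With this sketch the sample counts as two code points but one cluster, which matches the expectation in the original report.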