On Thu Aug 27 16:37:21 2015, raiph wrote:
> jnthn++ and others are busy with work that is far more important and
> urgent than dealing with this right now. I'm filing this bug now
> because there are reasons to consider addressing it before Christmas
> as explained below.
> 
> What I did
> ==========
> 
> say "नि".chars
> 
> What I expected
> ===============
> 
> 1
> 
> What I got
> ==========
> 
> 2
> 
> -----------------
> 
> Some reasons why I think it's appropriate to classify नि as a single
> grapheme:
> 
> 1. It's the last of 4 sample single graphemes in the "Extended
> Grapheme Clusters" section of the Unicode Standard Annex #29 on Text
> Segmentation:
> http://www.unicode.org/reports/tr29/tr29-27.html#Table_Sample_Grapheme_Clusters
> 
> (The Unicode standard suggests aiming at Extended Grapheme Clusters at
> a minimum if an implementation wishes to deal with grapheme clusters.)
> 
> 2. It's the first example in S15:
> https://raw.githubusercontent.com/perl6/specs/master/S15-unicode.pod
> 
> 3. It behaves as a single unit for selection in my browser. (You too?)
> 
> --------
> 

Our NFG algorithm has now been aligned with the definition of graphemes 
provided in Unicode Standard Annex #29. The Unicode database provides a test 
suite, which has been incorporated into the spectests in 
S15-nfg/grapheme-break.t (over 400 tests, all passing).
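
For readers who want to see why the original count was 2, here is a rough
sketch in Python (not Rakudo's NFG code). It shows that the string is two
code points, and that the second is a spacing combining mark (general
category Mc), which UAX #29 extended grapheme clusters keep attached to the
preceding character. The helper approx_grapheme_count is hypothetical and
only approximates the real rules by treating every Mn/Mc/Me character as
part of the previous cluster; it ignores Hangul, ZWJ sequences, regional
indicators and the rest of the break table.

```python
import unicodedata


def approx_grapheme_count(text: str) -> int:
    """Count clusters, attaching any combining mark (Mn, Mc, Me) to the
    preceding character. An approximation, not the UAX #29 algorithm."""
    count = 0
    for ch in text:
        is_mark = unicodedata.category(ch) in ("Mn", "Mc", "Me")
        # A mark starts a new cluster only if nothing precedes it.
        if count == 0 or not is_mark:
            count += 1
    return count


# DEVANAGARI LETTER NA followed by DEVANAGARI VOWEL SIGN I
s = "\u0928\u093F"

print(len(s))                               # number of code points
print([unicodedata.name(c) for c in s])     # the two code point names
print([unicodedata.category(c) for c in s])
print(approx_grapheme_count(s))             # approximate cluster count
```

Counting code points gives 2, while grouping the vowel sign with its base
letter gives 1, which is the result the report expected from .chars.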

Thanks,

/jnthn
