toNFC(0061 0305 0315 0300 05AE 0062) ->

From DerivedCombiningClass.txt
<http://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedCombiningClass.txt>:

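  05AE          ; 228 # Mn       HEBREW ACCENT ZINOR
  0300..0314    ; 230 # Mn  [21] COMBINING GRAVE ACCENT..COMBINING REVERSED COMMA ABOVE
  0315          ; 232 # Mn       COMBINING COMMA ABOVE RIGHT

In other words, 05AE (the Hebrew accent zinor) has combining class 228,
so canonical reordering moves it ahead of 0305 and 0300 (both 230) and
of 0315 (232). None of these marks has combining class 0; the only
starters in the sequence are

  (0061) on one side, and

  (0062) on the other side.

After reordering, 0305 sits between 0061 and 0300 and has the same
combining class as 0300, so it blocks their composition (0061 and 0305
themselves have no precomposed form). So you will effectively get the
composition of 0061 and 0300 into 00E0 (because it is also not
specifically excluded from composition in CompositionExclusions.txt
<http://www.unicode.org/Public/UCD/latest/ucd/CompositionExclusions.txt>)
in:

  toNFC(0061 0315 0300 05AE 0305 0062),

but NOT in:

  toNFC(0061 0305 0315 0300 05AE 0062).

I think you have mixed the two separate test cases.


The first thing to check is the canonical ordering: stably sort every
run of characters with non-zero combining class, and only then try to
compose. A mark is blocked from its starter when a character with
combining class 0, or with a combining class equal to or higher than
its own, stands between them.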
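
If it helps, the combining classes and the normalized result can be
checked without reading the data files by hand. Below is a minimal
sketch, assuming Python and its standard unicodedata module (whose
tables follow whatever UCD version your Python ships with). The helper
names parse, show, classes and to_nfc are made up for this example.

```python
import unicodedata

def parse(hex_seq):
    """Turn a string like '0061 0305' into the characters it names."""
    return "".join(chr(int(h, 16)) for h in hex_seq.split())

def show(text):
    """Format a string as space-separated, 4-digit hex code points."""
    return " ".join(f"{ord(c):04X}" for c in text)

def classes(hex_seq):
    """Pair each code point with its canonical combining class."""
    return [(h, unicodedata.combining(chr(int(h, 16))))
            for h in hex_seq.split()]

def to_nfc(hex_seq):
    """Normalize the sequence to NFC and return it in hex notation."""
    return show(unicodedata.normalize("NFC", parse(hex_seq)))

# The sequence from the question; any other sequence can be passed too.
seq = "0061 0305 0315 0300 05AE 0062"
for code_point, ccc in classes(seq):
    print(code_point, ccc)
print(seq, "->", to_nfc(seq))
```

Running the same helpers on a second sequence makes it easy to see
which of two similar test lines a given expected result belongs to.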

2014-03-10 19:34 GMT+01:00 Markus Doppelbauer <[email protected]>:

> Hello,
>
> I am working on a Unicode Normalization implementation. I have a question
> about a specific toNFC test rule.
>
>  toNFC(0061 0305 0315 0300 05AE 0062) =>
>      (0061 05AE 0305 0300 0315 0062)
> expected:
>      (0061 05AE 0305 0300 0315 0062)
>         \-------------/  =>
>      (00E0 05AE 0305      0315 0062)
>
> Why don't 0061 and 0300 combine to 00E0?
>
>  Thanks a lot
> Markus
>
>
> _______________________________________________
> Unicode mailing list
> [email protected]
> http://unicode.org/mailman/listinfo/unicode
>
>