On Thu, Nov 17, 2011 at 7:45 AM, Nick Wellnhofer <[email protected]> wrote:
> On 17/11/2011 13:37, Robert Muir wrote:
>>
>> The point of the derived property is that there are sneaky
>> interactions between these.
>
> Having a look at the utf8proc code, the function utf8proc_decompose_char
> calls itself recursively when substituting characters. So it looks like it
> does support NFKC_Casefold properly.

Yeah, the problematic ones can be seen here:
http://www.unicode.org/Public/5.0.0/ucd/DerivedNormalizationProps.txt

# Derived Property: FC_NFKC_Closure
#  Generated from computing: b = NFKC(Fold(a)); c = NFKC(Fold(b));
#  Then if (c != b) add the mapping from a to c to the set of
#  mappings that constitute the FC_NFKC_Closure list

So from what I can tell at a glance: with the utf8proc algorithm, if
you specify NFKC and casefolding, it's not yet 'done'. For the
characters in that list, the result can still change under another
round of fold + NFKC.

-- 
lucidimagination.com
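
To make the derivation quoted above concrete, here is a minimal sketch in Python. It is not utf8proc code: it assumes Python's str.casefold() as a stand-in for Fold and unicodedata.normalize("NFKC", ...) for NFKC, and the helper names (nfkc_fold, fc_nfkc_closure, problematic) are made up for illustration. Results follow the Unicode version bundled with the Python in use, not necessarily 5.0.0.

```python
import unicodedata


def nfkc_fold(s: str) -> str:
    """One pass of NFKC(Fold(s)); str.casefold() is assumed as Fold."""
    return unicodedata.normalize("NFKC", s.casefold())


def fc_nfkc_closure(ch: str):
    """Follow the quoted recipe: b = NFKC(Fold(a)); c = NFKC(Fold(b)).

    Returns c when c != b (a single pass was not enough), else None.
    """
    b = nfkc_fold(ch)
    c = nfkc_fold(b)
    return c if c != b else None


def problematic(limit: int = 0x110000):
    """Yield (code point, c) for every character where one pass is not stable."""
    for cp in range(limit):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip surrogates, they are not characters
        c = fc_nfkc_closure(chr(cp))
        if c is not None:
            yield cp, c


if __name__ == "__main__":
    # Print the first few characters that need a second pass.
    for i, (cp, c) in enumerate(problematic()):
        if i == 10:
            break
        print(f"{cp:04X} -> {' '.join(f'{ord(x):04X}' for x in c)}")
```

U+037A is a handy case to check by hand: folding leaves it alone, NFKC turns it into U+0020 U+0345, and only then does U+0345 fold to U+03B9, so a single fold-then-normalize pass stops one step short.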
