On Wed, Oct 6, 2010 at 8:49 AM, Mark Davis ☕ <[email protected]> wrote:
> ICU has a canonical iterator, one that provides all the strings that
> produce the same result under toNFC(...).

The algorithm is here: http://www.unicode.org/notes/tn5/#Enumerating_Equivalent_Strings

API: Search for "ICU CanonicalIterator" (without the quotes).

As Mark said, this is limited to canonical equivalences (NFC/NFD) but if necessary we could extend it to arbitrary normalization forms.

If you need additional support from ICU then please move this discussion there: http://site.icu-project.org/

markus
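
To illustrate what "all the strings that produce the same result under toNFC(...)" means, here is a rough brute-force sketch in Python using only the standard unicodedata module. It is not ICU's CanonicalIterator and not the algorithm from the technical note; it just searches for strings whose NFD matches the input's NFD, and is only practical for very short inputs. The function names and the `limit` parameter (which restricts the search to the BMP by default) are made up for this example.

```python
import unicodedata
from collections import Counter


def build_alphabet(target_nfd, limit=0x10000):
    """Collect characters below `limit` (an assumed cutoff, BMP by default)
    whose NFD uses only code points available in target_nfd."""
    available = Counter(target_nfd)
    alphabet = []
    for cp in range(limit):
        if 0xD800 <= cp <= 0xDFFF:  # skip surrogate code points
            continue
        ch = chr(cp)
        need = Counter(unicodedata.normalize("NFD", ch))
        if not (need - available):  # nothing needed beyond what is available
            alphabet.append((ch, need))
    return alphabet


def canonical_equivalents(s, limit=0x10000):
    """Brute-force set of strings canonically equivalent to s."""
    target = unicodedata.normalize("NFD", s)
    alphabet = build_alphabet(target, limit)
    results = set()

    def extend(prefix, remaining):
        if not remaining:
            # All code points are used up; keep the candidate only if
            # canonical reordering really yields the same NFD.
            if unicodedata.normalize("NFD", prefix) == target:
                results.add(prefix)
            return
        for ch, need in alphabet:
            if not (need - remaining):
                extend(prefix + ch, remaining - need)

    extend("", Counter(target))
    return results


if __name__ == "__main__":
    for t in sorted(canonical_equivalents("\u00c5")):
        print(" ".join(f"U+{ord(c):04X}" for c in t))
```

For U+00C5 the result should include the precomposed letter itself, U+212B ANGSTROM SIGN, and the sequence U+0041 U+030A, all of which normalize to the same NFC string. The search grows exponentially with input length, which is why a real implementation works segment by segment instead.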

