Hi Jennifer,

On Fri, Nov 1, 2013 at 8:37 AM, Jennifer Wong <[email protected]> wrote:
> I would like to ask for advice on removing accents from characters.
> While the normalization process is straight forward (NFD, remove accents),
> it does not take into account of special cases. For example, Danish, "å"
> should be mapped to "aa", not "a". Likewise, in German, "ä" "ö" "ü" should
> be mapped to "ae", "oe" and "ue" respectively, not "a", "e", "u". Are
> there common practices on how to handle these special cases? Thank you.
>

Can you describe what your use case is?

One area that does not appear to have been discussed yet is sorting of strings and full-text search (as in Ctrl-F in a browser or word processor). If that is what you are after, please look for "unicode collation" and "cldr collation". The ICU libraries <http://userguide.icu-project.org/collation> might also help.

Best regards,
markus
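
P.S. If plain accent stripping really is what you need, here is a minimal sketch in Python of the approach you describe (NFD, then remove combining marks), with a per-language override table applied first. The names SPECIAL_CASES and fold_accents and the language keys "da" and "de" are made up for illustration, and the table only covers the cases mentioned in your mail. It is not a substitute for proper collation.

```python
import unicodedata

# Hypothetical per-language overrides, applied before generic accent removal.
# Only the cases from this thread are listed; a real table would be larger.
SPECIAL_CASES = {
    "da": {"å": "aa", "Å": "Aa"},
    "de": {"ä": "ae", "ö": "oe", "ü": "ue",
           "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"},
}


def fold_accents(text, lang=None):
    """Strip accents, honouring the overrides for `lang` if there are any."""
    # Compose first so the precomposed keys in the table match the input
    # even when it arrives in decomposed form.
    text = unicodedata.normalize("NFC", text)
    for src, dst in SPECIAL_CASES.get(lang, {}).items():
        text = text.replace(src, dst)
    # Generic fallback: decompose, then drop all combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_accents("Århus", "da"))   # Aarhus
print(fold_accents("Müller", "de"))  # Mueller
print(fold_accents("Müller"))        # Muller
```

Note that the caller has to know the language of the text up front, which is one reason locale-aware collation is usually the better tool for searching and sorting.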

