On Nov 1, 2013, at 4:37 PM, Jennifer Wong <[email protected]> wrote:
> I would like to ask for advice on removing accents from characters. While the
> normalization process is straightforward (NFD, remove accents), it does not
> take special cases into account. For example, in Danish, "å" should be mapped
> to "aa", not "a". Likewise, in German, "ä", "ö", and "ü" should be mapped to
> "ae", "oe", and "ue" respectively, not "a", "o", and "u". Are there common
> practices for handling these special cases? Thank you.

Perhaps Sean M. Burke's Unidecode! may be of interest:

http://interglacial.com/tpj/22/
http://search.cpan.org/perldoc/Text::Unidecode
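
For the language-specific cases raised in the question, one possible approach is to apply a small per-language override table first and only then fall back to the generic NFD step. The following is a minimal Python sketch under that assumption, not a description of how Text::Unidecode works. The names SPECIAL_CASES and remove_accents are hypothetical, the language codes "da" and "de" are an assumed convention, and the uppercase entries are an assumption added alongside the mappings given in the question.

```python
import unicodedata

# Hypothetical override table: only the mappings mentioned in the question,
# plus assumed uppercase counterparts. Extend per language as needed.
SPECIAL_CASES = {
    "da": {"å": "aa", "Å": "Aa"},
    "de": {"ä": "ae", "ö": "oe", "ü": "ue",
           "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"},
}

def remove_accents(text, lang=None):
    """Strip accents, applying language-specific replacements first."""
    # Compose first so that "a" + combining ring matches the "å" table key.
    text = unicodedata.normalize("NFC", text)
    # Apply the overrides for this language (none if lang is unknown).
    text = text.translate(str.maketrans(SPECIAL_CASES.get(lang, {})))
    # Generic fallback: decompose, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(remove_accents("Århus", "da"))   # Aarhus
print(remove_accents("Müller", "de"))  # Mueller
print(remove_accents("Müller"))        # Muller
```

The overrides have to run before decomposition; once "ü" has been split into "u" plus a combining diaeresis, the generic step can no longer tell it apart from any other accented "u".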

