1. String.normalize should support the NFKC and NFKD Unicode normalization forms.
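To illustrate, here is a short sketch of what the requested options could look like. Note this is hedged: at the time of this proposal, `String.normalize/2` documented only `:nfc` and `:nfd`, so the `:nfkc`/`:nfkd` atoms below are the suggested additions, not the then-current API.

```elixir
# Sketch of the proposed API: :nfkc and :nfkd are the atoms this post
# suggests adding to String.normalize/2.
#
# NFKC applies compatibility decomposition followed by canonical
# composition, folding "presentation" characters into plain ones:
#   "ﬁ" (U+FB01 LATIN SMALL LIGATURE FI) -> "fi"
#   "①" (U+2460 CIRCLED DIGIT ONE)       -> "1"
String.normalize("ﬁle①", :nfkc)
#=> "file1"

# NFKD is the same compatibility decomposition without recomposition,
# so accents remain as separate combining marks:
String.normalize("é", :nfkd)
#=> "e" followed by U+0301 COMBINING ACUTE ACCENT
```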
Reference: https://www.unicode.org/reports/tr15/

These normalization forms are particularly useful for generating "machine identifiers" from user input, such as usernames.

2. The second part (which is independent but related) is support for Unicode transliteration. Basically, this is a "non-destructive" Unicode-to-ASCII conversion. There is an Elixir library that does this, https://github.com/fcevado/unidecode, and a JavaScript example, https://github.com/pid/speakingurl. There is also some discussion on the forum: https://elixirforum.com/t/how-to-replace-accented-letters-with-ascii-letters/539/8

My thinking is that all of these libraries do it a bit differently because, well, Unicode is hard. And precisely because Unicode is so hard, I think it should be implemented at the language level (or in a core library) so that it is done right and properly supported. It may not matter much for English speakers, but for many other languages it is something you will implement eventually, and often poorly.

Some references: http://cldr.unicode.org/index/cldr-spec/transliteration-guidelines

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group. To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/d2839fb2-984c-4bcf-b8fd-c891c8c24c83%40googlegroups.com.
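To make the transliteration point concrete, here is a minimal sketch of the approach discussed in the forum thread linked above: decompose to NFD, then strip the combining marks. This only covers accented Latin letters; characters like "ß", Cyrillic, or CJK need real per-script tables, which is exactly where the libraries diverge. The `Slug` module and `to_ascii` function are names invented for this example.

```elixir
defmodule Slug do
  @moduledoc """
  Minimal accent-stripping sketch; not full transliteration.
  """

  # Decompose each character (NFD), then delete the combining marks
  # (Unicode category Mn), leaving the base ASCII letters behind.
  def to_ascii(string) do
    string
    |> String.normalize(:nfd)
    |> String.replace(~r/\p{Mn}/u, "")
  end
end

Slug.to_ascii("Crème Brûlée")
#=> "Creme Brulee"
```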
