Another encoding, standardized for much longer, is the one IMAP uses for mailbox names. I don't think it has a standard charset name, but it is described in the IMAP RFCs (RFC 3501, section 5.1.3). It is a modified UTF-7, changed to make it filename-friendly and deterministic, and it may fit the bill. It is certainly useful for squeezing Unicode filenames into ASCII. I am sure many libraries implement it (ICU has one; see convrtrs.txt).
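To make the idea concrete, here is a minimal Python sketch of the encoding as I read it from RFC 3501, section 5.1.3: printable ASCII stays as-is, "&" becomes "&-", and every other run of characters is written as "&" + base64 of the UTF-16BE bytes (with "," instead of "/" and no "=" padding) + "-". The function names are made up for this example, and this is not ICU's implementation; it also does no validation of malformed input.

```python
import base64


def _encode_run(chars):
    # Encode one run of non-ASCII characters: UTF-16BE, then base64
    # with ',' in place of '/' and the '=' padding stripped.
    raw = "".join(chars).encode("utf-16-be")
    b64 = base64.b64encode(raw).decode("ascii").rstrip("=").replace("/", ",")
    return "&" + b64 + "-"


def encode_mailbox_name(name):
    # Hypothetical helper name, chosen for this sketch only.
    out = []
    pending = []  # characters waiting to be base64-encoded together
    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            if pending:
                out.append(_encode_run(pending))
                pending = []
            # '&' is the shift character, so a literal '&' is escaped as '&-'.
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    if pending:
        out.append(_encode_run(pending))
    return "".join(out)


def decode_mailbox_name(encoded):
    # Hypothetical helper name; assumes well-formed input.
    out = []
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        if ch != "&":
            out.append(ch)
            i += 1
            continue
        end = encoded.index("-", i)  # '-' is not in the base64 alphabet
        chunk = encoded[i + 1:end]
        if chunk == "":
            out.append("&")  # '&-' is a literal ampersand
        else:
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)  # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)
```

For example, encode_mailbox_name("Tom & Jerry") gives "Tom &- Jerry", and the RFC's own sample name "~peter/mail/&U,BTFw-/&ZeVnLIqe-" round-trips through decode and encode unchanged. Note that "/" still passes through literally, so for real filenames you would have to handle path separators yourself.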

By the way, ICU also implements IDNA and generic StringPrep.

Best regards,
markus

http://oss.software.ibm.com/cgi-bin/icu/convexp?conv=IMAP-mailbox-name&b=&s=ALL



