Another encoding, standardized for much longer, is the one IMAP uses for mailbox names. I don't think
it has a standard charset name, but it is described in the IMAP RFCs (RFC 2060 section 5.1.3, carried
over into RFC 3501). It is a variant of UTF-7, changed to be filename-friendly and deterministic, and
it may fit the bill. It is certainly useful for squeezing Unicode filenames into ASCII. I am sure
there are many libraries with an implementation (ICU has one, see convrtrs.txt).
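
For illustration, here is a rough Python sketch of the mailbox-name rules as I read them in RFC 3501
section 5.1.3: printable ASCII stands for itself, "&" becomes "&-", and everything else is written as
UTF-16BE in base64 with "," in place of "/" and no "=" padding, wrapped in "&" ... "-". The function
names are made up for this example, and this is not how ICU implements it.

```python
import base64


def _flush(pending, out):
    # Emit the pending non-ASCII run as "&" + modified base64 of UTF-16BE + "-".
    if pending:
        raw = "".join(pending).encode("utf-16-be")
        b64 = base64.b64encode(raw).decode("ascii")
        out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
        pending.clear()


def imap_utf7_encode(name):
    # Hypothetical helper: Unicode string -> ASCII-only mailbox name.
    out, pending = [], []
    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            _flush(pending, out)
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    _flush(pending, out)
    return "".join(out)


def imap_utf7_decode(text):
    # Hypothetical helper: inverse of imap_utf7_encode for well-formed input.
    out = []
    i = 0
    while i < len(text):
        if text[i] != "&":
            out.append(text[i])
            i += 1
            continue
        end = text.index("-", i)  # "-" is not in the base64 alphabet
        chunk = text[i + 1:end]
        if chunk == "":
            out.append("&")  # "&-" is a literal ampersand
        else:
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)  # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)
```

With this sketch, the example name from the RFC, "~peter/mail/台北/日本語", should come out as
"~peter/mail/&U,BTFw-/&ZeVnLIqe-". The decoder does no validation, so treat it as a starting point
rather than something to point at untrusted input.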
By the way, ICU also implements IDNA and generic StringPrep.
Best regards,
markus
http://oss.software.ibm.com/cgi-bin/icu/convexp?conv=IMAP-mailbox-name&b=&s=ALL