From section 4 "Conversion operations" of draft-ietf-idn-idna-08.txt (also discussed in section 6.7):

,----
| An application converts a domain name put into an IDN-unaware slot or
| displayed to a user. This section specifies the steps to perform in the
| conversion, and the ToASCII and ToUnicode operations.
|
| The input to ToASCII or ToUnicode is a single label that is a sequence
| of Unicode code points (remember that all ASCII code points are also
| Unicode code points). If a domain name is represented using a character
| set other than Unicode or US-ASCII, it will first need to be transcoded
| to Unicode.
`----

This last sentence seems to sweep a practical problem under the rug. Most systems today are not Unicode based, so in practice most systems will have to implement this unspecified transcoding. The Unicode Consortium has not specified how to convert between Unicode and legacy encodings. There are some unofficial mappings for ISO 8859-1 charsets at www.unicode.org/Public/MAPPINGS/, but not even unofficial mappings are present for other charsets (in particular CJK).

Real-world scenario: my machine uses ISO-8859-1. I enter 0xB5. How is this transcoded into Unicode? As U+00B5 or as U+03BC? There are many similar examples.

I think the third paragraph of the security considerations should express more clearly that IDNA actually is vulnerable to the attack if machines use legacy encodings, as most machines on the Internet do.

Some high-level insight on the problem: http://www.cl.cam.ac.uk/~mgk25/unicode.html#conv
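
To make the 0xB5 scenario concrete, here is a minimal Python sketch. It is only an illustration, not anything the draft specifies: the helper names (decode_guesses, describe) and the choice of ISO-8859-7 as a second guess are my own assumptions. It shows that the code point a label starts out with depends entirely on which charset the transcoder assumes for the byte.

```python
import unicodedata

RAW = b"\xb5"  # the single byte entered on the legacy system

# Assumption: two charsets a transcoder might plausibly guess for RAW.
GUESSES = ["iso-8859-1", "iso-8859-7"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def decode_guesses(raw: bytes, charsets):
    """Decode the same bytes under each guessed charset (hypothetical helper)."""
    return {cs: raw.decode(cs) for cs in charsets}


if __name__ == "__main__":
    # Same byte, different Unicode code point per assumed charset.
    for charset, ch in decode_guesses(RAW, GUESSES).items():
        print(f"{charset:>10}: {describe(ch)}")

    # The two candidates from the scenario are distinct code points...
    micro, greek_mu = "\u00b5", "\u03bc"
    print(describe(micro), "vs", describe(greek_mu), "equal:", micro == greek_mu)

    # ...although NFKC normalization happens to fold this particular pair.
    print("NFKC of micro sign:", describe(unicodedata.normalize("NFKC", micro)))
```

Since Nameprep builds on NFKC, this specific pair may well come out identical after preparation. That is a property of this one character, though, not of the transcoding step, which remains an unspecified choice made before ToASCII ever sees the label.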
