[EMAIL PROTECTED] (Martin v. Löwis) writes:

> Paul Hoffman / IMC <[EMAIL PROTECTED]> writes:
>
>> There is a new draft that is of very direct interest to this mailing
>> list. It would be great if people who have implemented IDNA could
>> check their results against this draft.
>
> Please correct me if I'm wrong: I believe the UTF-8 strings are wrong
> in a number of tests:
>
> 4.3: U+00DF should be \xc3\x9f, not \xc3\xdf
> 4.9: U+01F0 should be \xc7\xb0, not \xc7\xf0
> 4.44: U+00DF should be \xc3\x9f, not \xc3\xdf
> 4.45: Likewise.

You are right. I thought I could convert simple strings by hand, but obviously I did not pick the proper UTF-8 encoding in some cases. I believe you caught all the errors. Since those examples test whether the application uses a validating UTF-8 decoder, I will keep those examples, modified to result in a UTF-8 decoding error. Non-validating UTF-8 decoders should still produce the correct Unicode code point for those UTF-8 encodings, though.
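For anyone who wants to check the byte sequences discussed above, here is a minimal Python sketch. It is only an illustration and not part of the draft: `strict_ok` and `lenient_decode_2byte` are hypothetical helper names, and the lenient decoder is an assumed model of a non-validating decoder that masks the payload bits without checking the `10xxxxxx` prefix of the continuation byte.

```python
def strict_ok(data: bytes) -> bool:
    """Return True if Python's validating UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def lenient_decode_2byte(data: bytes) -> str:
    """Hypothetical non-validating decoder for a two-byte sequence.

    It keeps the low 5 bits of the lead byte and the low 6 bits of the
    trailing byte, without verifying that the trailing byte is a real
    continuation byte (0x80..0xBF).
    """
    lead, trail = data[0], data[1]
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


# Code point -> (character, malformed bytes quoted in the message above)
cases = {
    "U+00DF": ("\u00df", b"\xc3\xdf"),
    "U+01F0": ("\u01f0", b"\xc7\xf0"),
}

for name, (char, malformed) in cases.items():
    correct = char.encode("utf-8")
    print(name)
    print("  correct encoding:        ", correct)
    print("  strict accepts correct:  ", strict_ok(correct))
    print("  strict accepts malformed:", strict_ok(malformed))
    print("  lenient result matches:  ", lenient_decode_2byte(malformed) == char)
```

In both malformed sequences the second byte has its top two bits set to `11` instead of `10`, so a validating decoder rejects it, while a decoder that only masks the low six bits arrives at the intended code point.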
