On Fri, 21 Feb 2003, Russ Allbery wrote:
> We know that some existing software will work with
> UTF-8 newsgroup names out of the box without modification, although it
> will require some tweaking for ideal operation.

Step back from the tree to see the forest.

"Some tweaking" includes such small details as not assuming that one byte
means one screen space, which has all sorts of implications in GUI programs
and screen editors. Don't forget that you don't just have variable-length
characters in UTF-8, but you also have variable-width characters in Unicode,
plus characters which interact with other characters in interesting ways.

A bit more than a "tweak".

Stepping back further is the realization that it won't work in any
reasonable way if the program doing the actual display and user input
(whether terminal emulator or the news reading application itself) is not
configured for UTF-8.

> By comparison,
> punycode (C) we know won't work correctly with *any* existing software;
> the only reason why that column is a D instead of Y is that users can
> use the funny-looking encoded names and still participate in the
> groups.

Just as "some tweaking" was understated, "won't work" is overstated here.

Punycode names will work perfectly well as ASCII names. If the program doing
the actual display and user input is not cognizant of punycode, it will fall
back on ASCII display/input that will work in the same way around the world.

The wildmat problem is a red herring. Wildmat implementations need to be
cognizant of Unicode in far more substantial ways than merely overcoming
punycode issues. A well-thought-out stringprep requirement will help some,
but then the stringprep has to be implemented.

-- Mark --

http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.
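
To make the byte/character/column distinction concrete, here is a minimal Python sketch. It is only an illustration: the function names (display_columns, measure, ascii_fallback) are hypothetical, and the column rule is a rough approximation (combining marks take no column, East Asian wide and fullwidth characters take two, everything else one), not what any particular terminal or newsreader actually does. The last function uses Python's built-in "punycode" codec, which yields the raw encoding with no prefix; the exact form a newsgroup-name scheme would put on the wire is an assumption left open here.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Approximate the number of terminal columns `text` occupies.

    Assumption: combining marks occupy 0 columns, East Asian wide (W) and
    fullwidth (F) characters occupy 2, and every other character occupies 1.
    """
    cols = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks attach to the previous character.
            continue
        cols += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cols


def measure(name: str) -> tuple:
    """Return (UTF-8 byte count, code point count, approximate columns)."""
    return len(name.encode("utf-8")), len(name), display_columns(name)


def ascii_fallback(label: str) -> str:
    """Encode one label as raw punycode, which is plain ASCII text."""
    return label.encode("punycode").decode("ascii")


print(measure("comp.lang.c"))  # (11, 11, 11): all three counts agree
print(measure("日本語"))        # (9, 3, 6): bytes, code points, columns differ
print(measure("e\u0301"))      # (3, 2, 1): 'e' plus a combining acute accent
print(ascii_fallback("日本語"))  # an ASCII-only string any client can show
```

The three numbers agree only for plain ASCII names, which is why "one byte, one screen space" breaks once UTF-8 names appear. The punycode form, by contrast, survives unchanged in software that knows nothing about it, at the cost of being unreadable to the user.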
