> > 2. There are no "non-Unicode coding systems" that unify beta
> > and eszett; the language issue is irrelevant.
>
> Sure there are.  We call some of them "books".  Transcription of
> a language into printed form involves a coding system.  And I
> have to assume, although I can claim no personal knowledge, that
> German schoolchildren, brought up looking at Eszett, have to be
> taught, when they encounter mathematical notation that uses
> Greek characters (if not sooner), that it is important to notice
> either the context or the descender -- that the two characters
> are not the same.

My point was that they probably have about the same level of confusion
that you had when you first saw a gamma as a schoolchild and confused
it with a y.

> These distinctions, including getting used to
> the variations and similarities in different fonts of I-l-1, are
> bits of pattern recognition that lay people --as distinct from
> font or character set experts-- rapidly learn, within their own
> language and script contexts, to distinguish from context or by
> relatively subtle clues.  I can't even spell out Arabic or Thai
> scripts because I don't have enough experience with the right
> set of clues -- my loss, but these are learned skills.

True.

> But this isn't the point, so whether there are, or are not,
> coded character sets that unify the two is not the point either
> (I'll defer to your knowledge and experience on this subject,
> since I haven't studied the question, but statements that sound
> like universal negatives always scare me).

Let me qualify my negative: I have never heard of any that do, and of
the 750-odd code page mapping tables that we have collected on major
platforms (http://oss.software.ibm.com/cvs/icu/charset/data/xml/),
none of them do.  Of course, if you go down to Arkansas Bob's "Bait,
Tackle, and Character Encodings Shack", he can whip up a nice
character set in no time flat that'll unify them.

You should understand my phrase 'no "non-Unicode coding systems"' as
meaning 'no "non-Unicode coding systems" of any importance or impact',
and I'll now understand your phrase "non-Unicode coding systems" as
meaning "books" ;-)

..

> (ii) It is addressed to, and solves, a very narrow problem.  We
> (for some definition of "we") have not been explicit, in an
> Internet context, as to what that problem is.  I believe that we
> should be explicit.
> Then, having carefully described that
> problem, we then need to carefully evaluate the question of
> whether the benefits of solving it outweigh the risks to the use
> of the DNS in the Internet community that it might pose.  If we
> conclude that we can't reasonably do that evaluation (e.g.,
> because it isn't an IETF problem), then I think we are still
> obligated to delineate the issues and risks to the best of our
> ability -- at least to the extent of writing down the
> implications of problems and issues we already know about.
>
> (iii) A number of items of knowledge and recommendations have
> surfaced in the working group -- of which your suggestion above
> is an excellent example -- that could be used to reduce or
> eliminate some of those risks to the DNS as a piece of usable
> Internet infrastructure.  I think they need to be written down
> as part of WG output, if only because "this risk can be
> ameliorated if one does so-and-so" is a much more satisfactory
> statement than "there is this horrible problem and we should
> consider stopping progress until someone has a solution".

I agree that in both of these cases capturing additional information
as guidelines for dealing with particular issues would be very
helpful.

> Of course, if "mixed scripts in domain names" are considered
> good things, warning when they occur won't help much.  But
> _that_ one, I would contend, is not an IETF problem although I
> think it would be wise and responsible for us to point out that
> mixed script labels pose challenges that homogeneous ones do not.

I agree.  Such UIs would certainly alert people to something very
fishy going on in the case of "Intel.com" spelled with a Cyrillic 'e',
without preventing legitimate registrations such as
"ABC<alpha><beta><gamma>.com".  For the latter, the visual indication
would be there, and people would be alerted to the multiple scripts,
but it would not prevent usage.

Regards,
Mark
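
As an illustration of the mixed-script warning discussed above, here
is a minimal sketch of how a UI might detect that a label uses more
than one script.  It is a rough heuristic, not a definitive
implementation: Python's standard library does not expose the Unicode
Script property, so the sketch assumes that the first word of a
character's Unicode name ("LATIN", "CYRILLIC", "GREEK") is a good
enough stand-in.  The function names scripts_in_label and
is_mixed_script are hypothetical.

```python
import unicodedata


def scripts_in_label(label):
    """Return the set of approximate script names for the letters in label.

    Assumption: the first word of the Unicode character name is used as
    the script. This is a heuristic, not the Unicode Script property.
    """
    scripts = set()
    for ch in label:
        if not ch.isalpha():
            continue  # digits and hyphens are shared by all scripts
        name = unicodedata.name(ch, "")
        scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    return scripts


def is_mixed_script(label):
    """True if the letters in label appear to come from several scripts."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    # "Intel" with U+0435 CYRILLIC SMALL LETTER IE instead of Latin 'e'
    for label in ["Intel", "Int\u0435l", "ABC\u03b1\u03b2\u03b3"]:
        flag = "WARN" if is_mixed_script(label) else "ok"
        print(flag, ascii(label), sorted(scripts_in_label(label)))
```

Note that the check only flags a label; it does not reject it.  Both
the spoofed "Intel" and the legitimate Latin-plus-Greek label are
reported as mixed, which matches the idea of alerting the user without
preventing usage.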
