This message contains responses to Martin Duerst, Mark Davis, and Keith
Moore.

Martin Duerst <[EMAIL PROTECTED]> wrote:

> But if [nameprep] is supposed to be strictly applied on every
> occasion, in particular every time a name is resolved, as e.g. Patrick
> is describing it, these foldings may lead to people believing that
> these characters are acceptable in a domain name the same way an
> upper-case character is.

What's wrong with actually considering them to be acceptable in the same
way as uppercase characters?

> 1) Characters that are very clearly visually distinct from the ones
> they are mapped to. (for obvious reasons)

The reason is not obvious to me. B and b are very clearly visually
distinct, yet we live happily in a world where both berkeley.edu and
Berkeley.EDU are commonly used, and are equivalent.

> 2) Characters that completely map to ASCII-only characters (to make
> sure that current applications and applications doing nameprep
> behave the same way for ASCII-only).

I don't see the problem, and you are apparently proposing to eliminate
the mapping from full-width Latin to half-width Latin, which I think
would be unfortunate.

> An example of a character for which both of the above apply is
> U+2460 CIRCLED DIGIT ONE. This is a simple digit 'one' in a circle.
> Obviously, everybody can immediately see that it's different from
> just a '1'. Also, if it's mapped to '1' by nameprep applications,
> these applications will behave differently from applications not using
> nameprep (i.e. everything out there now).

Well of course, if you try to compare names without doing nameprep, then
you will fail to see matches where you should. Of course feeding a
non-ASCII hostname to an IDN-unaware application is asking for trouble.
If you decide to write the domain name foo1.com using a circled 1, then
obviously you are using IDN functionality and you assume the same risk
as anyone using any IDN: the risk that someone will try to paste it into
an IDN-unaware application.

Mark Davis <[EMAIL PROTECTED]> wrote:

> I think the whole notion of trying to prevent cross-script confusions
> in domain names is a morass.
>
> Better would be to have useful GUIs that detect and signal possibly
> confusing names.

I agree. There's really no way to guarantee that the distinction between
non-equivalent names is apparent to users. Even with a font that
distinguishes 1 from l and 0 from O, people still overlook misspellings,
even blatant ones like "whitehouse.com" instead of "whitehouse.gov".

What nameprep can do, however, is help make sure that users are able to
type the names they have in mind. So a user who wants to type
<omicron><omicron> can do so, and a user who wants to type oo (Latin)
can do so (regardless of whether their keyboard interface defaults to
full-width or half-width). A user who configures their browser to accept
cookies from oo.com (Latin) will not accidentally get cookies from the
Greek look-alike.

> If it is impossible to register names *except those that have been
> nameprepped*, then whether or not UTF-8 names go over the wire
> unprepped or not should not cause much of a security problem.

Here's another way to think of it: Whenever I use any domain name for
any purpose, I am trusting the name servers of that zone (and all its
ancestor zones) to give out true information when that name is looked
up. With IDNs, I must now also trust those servers not to give out bogus
information in response to non-nameprep'd equivalent representations of
that name, but my exposure has not increased--I was completely at the
mercy of those servers before, and I still am. So as you say, allowing
non-nameprep'd UTF-8 on the wire does not cause a security problem.

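
As a rough illustration of the equivalences discussed above, here is a
minimal Python sketch. It is not the nameprep algorithm itself: it only
approximates the mapping step with NFKC normalization plus Unicode case
folding, and it skips the prohibited-character checks entirely. The
function names approx_nameprep and same_name are made up for this
example.

```python
import unicodedata


def approx_nameprep(name: str) -> str:
    """Rough stand-in for nameprep's mapping step (an assumption, not the spec).

    Compatibility-normalize, case-fold, then normalize once more in case
    folding produced something that is no longer in NFKC form.
    """
    folded = unicodedata.normalize("NFKC", name).casefold()
    return unicodedata.normalize("NFKC", folded)


def same_name(a: str, b: str) -> bool:
    """Compare two names after the approximate mapping."""
    return approx_nameprep(a) == approx_nameprep(b)


latin_oo = "oo.com"
fullwidth_oo = "\uff4f\uff4f.com"   # FULLWIDTH LATIN SMALL LETTER O, twice
greek_oo = "\u03bf\u03bf.com"       # GREEK SMALL LETTER OMICRON, twice
circled = "foo\u2460.com"           # U+2460 CIRCLED DIGIT ONE

print(same_name("Berkeley.EDU", "berkeley.edu"))  # True: case is folded
print(same_name(fullwidth_oo, latin_oo))          # True: full-width maps to half-width
print(same_name(greek_oo, latin_oo))              # False: omicron stays distinct
print(approx_nameprep(circled))                   # foo1.com
print(circled == "foo1.com")                      # False: raw comparison misses the match
```

The last two lines show the point made to Martin: the circled 1 matches
foo1.com only when both sides are compared after mapping, which is
exactly the risk any IDN carries when handed to an IDN-unaware
application.
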
Keith Moore <[EMAIL PROTECTED]> wrote:

> fuzzy matching on the server side.

Fuzzy matching at lookup time (as opposed to registration time) would
not help with the problem of domain name spoofing (like tricking someone
into following a link to yah<omicron><omicron>.com). It would help with
the problem of someone seeing oo.com in print and not knowing whether
it's Latin or Greek, but that problem is much rarer, is usually resolved
by context, and can be avoided by people choosing to register names that
are not so confusing. I think fuzzy matching in servers would be more
trouble than it's worth.

AMC
