The comparison of loose vs loose should never be done, for the reasons you outline. That leaves
a) strict vs strict
b) strict vs loose

(a) is always safe. (b) works for situations like DNS, where there is a server on one side storing strict keys, and queries are made on the other and may be loose or strict. It would work for similar situations, such as email, if at one end there is a server that does validation (of, in this case, email names).

Mark
—————

Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο πάντα — Ὁμήρου Μαργίτῃ
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

----- Original Message -----
From: "Adam M. Costello" <[EMAIL PROTECTED]>
To: "IETF idn working group" <[EMAIL PROTECTED]>
Sent: Thursday, January 31, 2002 14:03
Subject: Re: [idn] stringprep comment 2

> This message argues both sides of the issue. :)
>
> Soobok Lee <[EMAIL PROTECTED]> wrote:
>
> > The latter can be as catastrophic as the former.
>
> I assume you meant that false negatives (where names don't match when
> they should) can be as catastrophic as false positives (where names
> match when they shouldn't). Can you back up that claim?
>
> > if each application vendor adopts its own different nameprep profile,
> > applications behaviors may be unpredictable across applications for
> > end users.
>
> Do you have a suggestion? What should happen when an application
> encounters a name that uses code points newer than the application's
> version of nameprep? If the application prohibits unassigned code
> points, then the name will never match anything, because ToASCII will
> fail. If the application allows unassigned code points, then the name
> will never match the wrong thing, and might sometimes match the right
> thing (in practice, I think it usually will work). Which is preferable?
> The conservative approach (never match) is more predictable, but the
> other approach (match if you're lucky) might make users happier.
>
> Wait, I just realized why we needed to avoid comparing two strings
> that have both been prepared using loose stringprep. If they both use
> unassigned code points that turn out to be prohibited in future versions
> of nameprep, then they might match even though they are both invalid
> names. That's a false positive, which is bad. So we do indeed need to
> avoid such comparisons.
>
> Disregard my suggestion from my last message.
>
> Perhaps the stringprep spec should say that applications may use loose
> stringprep only if they know for sure that the name will never be
> compared against a name that was also prepared using loose stringprep.
> If there's no way to know, then you must use strict stringprep.
>
> In the case of DNS, if the IDNA spec requires authoritative servers
> to use strict nameprep, then clients are free to prepare queries
> using loose nameprep. Other protocols could in principle use similar
> methods--requiring strict nameprep at "one end" (whatever that means for
> that protocol) so that the "other end" can use loose nameprep.
>
> But how practical is that? Take email headers for example. Who has any
> idea what will be done with domain names that appear in email headers?
>
> Maybe it would be a lot simpler and safer just to prohibit unassigned
> code points always. If you want to use new characters, you'll just have
> to upgrade your software to the new nameprep, sorry.
>
> Can we get some more people involved in this thread? I think Soobok is
> right that the existing wording in stringprep about "stored strings" and
> "query strings" is going to be very difficult to interpret in practice,
> and something needs to be done about it, but I don't know what.
>
> AMC
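
As a rough illustration of the rule discussed in this thread (strict vs strict and strict vs loose are acceptable, loose vs loose is not), here is a minimal Python sketch. The names `Prep`, `has_unassigned`, `prepare`, `may_compare` and `names_match` are hypothetical and appear in neither spec. `prepare` is only a stand-in that case-folds and, in strict mode, rejects unassigned code points; it is not real nameprep. The one library call assumed is `stringprep.in_table_a1` from Python's standard library, which tests membership in table A.1 of RFC 3454 (code points unassigned in Unicode 3.2).

```python
import stringprep
from enum import Enum


class Prep(Enum):
    STRICT = "strict"  # unassigned code points are prohibited
    LOOSE = "loose"    # unassigned code points are allowed


def has_unassigned(label: str) -> bool:
    """True if any character is in stringprep table A.1 (unassigned in Unicode 3.2)."""
    return any(stringprep.in_table_a1(ch) for ch in label)


def prepare(label: str, mode: Prep) -> str:
    """Hypothetical stand-in for nameprep: case-fold, and reject unassigned in strict mode."""
    if mode is Prep.STRICT and has_unassigned(label):
        raise ValueError("unassigned code point in a strictly prepared label")
    return label.casefold()


def may_compare(a: Prep, b: Prep) -> bool:
    """A comparison is acceptable only if at least one side was prepared strictly."""
    return a is Prep.STRICT or b is Prep.STRICT


def names_match(a: str, a_mode: Prep, b: str, b_mode: Prep) -> bool:
    """Compare two labels, refusing the loose vs loose case outright."""
    if not may_compare(a_mode, b_mode):
        raise ValueError("loose vs loose comparison is not allowed")
    return prepare(a, a_mode) == prepare(b, b_mode)
```

In the DNS case, the authoritative server would correspond to the `Prep.STRICT` side and the client query to the `Prep.LOOSE` side, so `names_match("Example", Prep.STRICT, "EXAMPLE", Prep.LOOSE)` is a permitted comparison, while passing `Prep.LOOSE` for both sides raises an error.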
