[This message replies to messages from both Eric Hall and John Klensin.]

"Eric A. Hall" <[EMAIL PROTECTED]> wrote:

> > I was suggesting that conversion between ASCII and non-ASCII never
> > be done inside the infrastructure except possibly when it uses the
> > well-known standard profile; for application-specific profiles, I
> > was suggesting that conversion be done only at the edges. This
> > model avoids profile-agnostic conversion; only entities that know
> > the proper profile perform the conversion, which simplifies the
> > security analysis.
>
> If the query failed at some remote point in the infrastructure (like
> the authoritative zone for the owner name in question), then the
> infrastructure would get whacked twice every time the application went
> to ask for that data (once for the EDNS form, again for the ACE form).

I guess I still wasn't clear enough. For labels that use
application-specific profiles, there would not be two forms inside the
infrastructure, only one form. The application would always convert
to/from that form at the edges as necessary. Inside the infrastructure,
dual forms would be used only for labels that use the standard profile.

This adds a small burden to applications that don't use the standard
profile, but it significantly simplifies the model.

> This is a very serious concern and must be prevented at all costs.

I think that's an exaggeration. A factor-of-two increase in overhead
(which I'm not proposing, but even if I were) would not bring down the
network. It would be small compared to the exponential growth of the
internet.

> How significant would the change be to have the application call the
> appropriate profile prior to calling ToASCII, rather than having
> ToASCII make the call itself?

Whether Stringprep is called inside ToASCII or before ToASCII is beside
the point. The real question is "How significant would the change be to
allow IDNA to use more than one profile?" It would be a fundamental
change that greatly complicates the model.

Here are some more examples of the complications:

What happens when two labels that use different profiles are compared?
They might be identical in their UTF-8 forms but different in their
ASCII forms, so you might get a different answer depending on which form
they're in. Do we need to prohibit the comparison of labels that use
different profiles? Old software cannot possibly be aware of such a
prohibition. What are the security implications? How can we prohibit
such comparisons and still allow DNS ANY queries, which by their nature
need to compare the same query string against many names associated with
many different RR types?

The same issue arises when a domain label is copied from one slot to
another, and the two slots call for different profiles. Does that need
to be prohibited? Old software cannot be aware of the prohibition. What
are the security implications?

What if someone wants to associate multiple RR types with the same name,
and those RR types call for different profiles?

What profile should the owner name of a CNAME record use? Should there
be a single profile for all CNAME records? If so, which profile? Or
should the owner of a CNAME record use the same profile as the name it
refers to? If so, what happens when the latter name is in another zone
and its RR type gets changed without notice? What are the security
implications?

Is it possible to create subdomains of domains that use non-standard
profiles? If so, a single domain name might need a different profile for
each label. But each name is associated with only one RR type. The RR
type might imply which profile to use for the first label (or the first
k labels), but what about the rest?

Eric, you are obviously a smart guy, and it wouldn't surprise me if you
could solve all these issues, but I think the multi-profile model is too
complex and too subtle to be worth its modest benefits.
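
To make the comparison problem above concrete, here is a rough sketch in Python. The two profiles, `fold_profile` and `preserve_profile`, are hypothetical stand-ins invented for illustration (neither is a real Stringprep profile), and the `xn--` prefix is used only as an assumed ACE prefix. The only library features relied on are `unicodedata.normalize` and Python's built-in `punycode` codec.

```python
import unicodedata


def fold_profile(label: str) -> str:
    # Hypothetical "standard-like" profile: compatibility-normalize, then lowercase.
    return unicodedata.normalize("NFKC", label).lower()


def preserve_profile(label: str) -> str:
    # Hypothetical application-specific profile: normalize but keep case.
    return unicodedata.normalize("NFC", label)


def to_ace(label: str, profile) -> str:
    # Apply the given profile, then Punycode-encode with an assumed ACE prefix.
    prepared = profile(label)
    return "xn--" + prepared.encode("punycode").decode("ascii")


label = "\u00dcber"  # "Über", the same Unicode string in both slots

ace_folded = to_ace(label, fold_profile)
ace_preserved = to_ace(label, preserve_profile)

# The Unicode forms are identical, but the ASCII forms differ even under
# case-insensitive comparison, so the answer depends on which form is compared.
same_in_unicode = (label == label)
same_in_ascii = (ace_folded.lower() == ace_preserved.lower())
print(same_in_unicode, same_in_ascii)
```

Software that only ever sees the ASCII forms has no way to tell that these two labels started out as the same string, which is the situation the questions above are about.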

The single-profile model covers the common case, and applications that
can't use the standard profile can still do what they need to do, as
long as they do it outside the infrastructure (before converting their
data types into domain names, and after converting domain names back
into their data types).

An analogy might help clarify that model. Each U.S. citizen typically
has a social security number, which is nine digits. It is possible (and
in fact trivial) to map social security numbers onto domain labels; for
example, 123456789.nicemice.net. Domain labels can contain letters, but
that doesn't mean social security numbers can contain letters. Domain
labels and social security numbers are two entirely distinct data types.

Similarly, data types like email address local parts are not domain
labels, even if they are sometimes mapped onto domain labels. Just
because domain labels can (soon) contain non-ASCII characters, that
doesn't mean email address local parts can. RFC 2822 says they are
ASCII, so they can't contain non-ASCII characters until someone defines
internationalized email address local parts.

IDN labels do not preserve case or non-normalized forms. If an
application needs to map a data type onto a domain label, and the data
type contains non-ASCII text for which mixed-case forms or
non-normalized forms must be preserved, then the trivial mapping simply
won't work. The application will need to design a more complex mapping
to suit its needs. For example, when internationalized email address
local parts are defined, that specification will have the opportunity to
say how they get mapped onto domain labels.

Whenever you try to use one data type to represent another data type,
you might get lucky and find that it's trivial, or there might be a
mismatch and you might have to do more complex mapping/encoding. That's
just the way it goes.
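
The point about IDN labels not preserving case or non-normalized forms can be checked with a small sketch. It assumes Python's built-in `idna` codec, which implements the IDNA specification as later published (RFC 3490 with Nameprep), is an acceptable stand-in for the standard profile discussed here. The helper names `to_ace` and `from_ace` are made up for this example.

```python
def to_ace(label: str) -> str:
    # Unicode label -> ASCII-compatible form, via the built-in "idna" codec.
    return label.encode("idna").decode("ascii")


def from_ace(ace: str) -> str:
    # ASCII-compatible form -> Unicode label.
    return ace.encode("ascii").decode("idna")


mixed_case = "B\u00fccher"    # "Bücher", capital B, precomposed u-umlaut
lower_case = "b\u00fccher"    # "bücher"
decomposed = "bu\u0308cher"   # "u" followed by a combining diaeresis

# All three spellings collapse to one ASCII form.
all_equal = to_ace(mixed_case) == to_ace(lower_case) == to_ace(decomposed)

# The round trip returns the folded, normalized spelling, not the input.
round_trip = from_ace(to_ace(mixed_case))
print(all_equal, round_trip == mixed_case, round_trip == lower_case)
```

An application whose data type must keep "Bücher" distinct from "bücher" therefore cannot use the trivial mapping and would need to encode the distinction itself before producing a domain label.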

John C Klensin <[EMAIL PROTECTED]> wrote:

> Either this is a DNS protocol, in which case it needs to specify
> applicable RRs and fields and may reasonably specify a <foo>prep
> profile --even if it can also be used for names (domain and otherwise)
> in other contexts-- or it is a generic internationalization protocol,
> in which case applicability to the DNS and domain names needs to be
> specified somewhere else.

It is neither. Domain name is a data type that is used in many, many
protocols, and DNS is just one of them. IDNA is a technique for enabling
applications to use non-ASCII characters in domain names, regardless of
which protocol/interface those domain names are traversing. IDNA is not
a generic internationalization protocol; it applies only to domain
names. IDNA is not an extension of DNS; it is not specific to any
particular protocol.

[In another note you point out several places where the IDNA spec seems
to get too far into the guts of DNS. I agree with a lot of that. I
personally have kept a closer watch on sections 2 through 4 (which I
wrote the first draft of, and which say nothing about DNS) than on the
other sections (which existed before I got here).]

> (4) To what extent should out-of-band communications between
> applications, which utilize strings which the applications might
> construe as internationalized domain names, influence the design of
> IDNA and, if so, what should the impact be?

I'm not sure what you mean by "out-of-band" and "might construe". The
domain names that appear in SMTP MAIL and RCPT commands are, I would
say, very much in-band, and there's no "construing"; they are
unmistakably domain names. If we are going to enable non-ASCII
characters in domain names in DNS, but not in other protocols, what's
the point?

AMC
