As sent by Keith Moore. We will also speak to the list moderator about the
policy of the [EMAIL PROTECTED] mailing list.

Please be patient while we work this out. Thanks.

-James Seng

> To: [EMAIL PROTECTED]
> Subject: comments on draft-ietf-idn-requirements-03.txt
> cc: [EMAIL PROTECTED]
> From: Keith Moore <[EMAIL PROTECTED]>
> Date: Mon, 31 Jul 2000 16:55:36 -0400
> Sender: [EMAIL PROTECTED]
>
> > 1. Introduction
> >
> > At present, the encoding of Internet domain names is restricted to a
> > subset of 7-bit ASCII (ISO/IEC 646). HTML, XML, IMAP, FTP, and many
> > other text based items on the Internet have already been at least
> > partially internationalized. It is important for domain names to be
> > similarly internationalized or for an equivalent solution to be found.
> > This document assumes that the most effective solution involves putting
> > non-ASCII names inside some parts of the overall DNS system.
>
> Since you've made that assumption, it is of course good to state it.
> However, this is not an appropriate constraint to impose on a solution
> to the IDN problem. The IDN problem is being investigated because of
> the needs of human users, not because of any requirement at the DNS
> protocol level. Users don't care about the bits on the wire that are
> transmitted to and from DNS; they care about the interchangeability of
> DNS names at the application layer *and above*. Focusing attention
> on the DNS layer diverts attention from the most thorny parts of the
> IDN problem - the user interfaces to applications where DNS names
> are entered and displayed, and the interfaces between different
> applications where DNS names are exchanged.
>
> For any IDN work to be effective, it may require changes to many different
> pieces - DNS servers, operating systems, software libraries, application
> protocols, and user interfaces - on many different platforms.
> These pieces *must* be able to change independently from one another, and
> everything that was once working *must* be able to keep on working.
> This means that normal use of ASCII-only domain names must keep working;
> it also means that once use of IDNs starts working for some set of
> components, it should not cease to work when one of those components
> is upgraded.
>
> This process will likely take many years, and it must be realized that
> IDN will not be universally available during that transition period.
>
> > 1.1 Definitions and Conventions
> >
> > The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
> > "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
> > document are to be interpreted as described in [RFC2119].
>
> The 2119 definitions seem inappropriate for a requirements document.
>
> > 1.4 A multilayer model of the DNS function
> >
> > The DNS can be seen as a multilayer function:
> >
> > - The bottom layer is where the packets are passed across the Internet
> >   in a DNS query and a DNS response. At this level, what matters is
> >   the format and meaning of bits and octets in a DNS packet.
> >
> > - Above that is the "DNS service", created by an infrastructure of DNS
> >   servers, NS records that point to those DNS servers, that is
> >   pointed to by the root servers (listed in the "root cache file" on each DNS
> >   server, often called "named.cache". It is at this level that the
> >   statement "the DNS has a single root" [RFC2826] makes sense, but
> >   still, what are being transferred are octets, not characters.
> >
> > - Interfacing to the user is a service layer, often called "the resolver
> >   library", and often embedded in the operating system or system
> >   libraries of the client machines. It is at the top of this layer that
> >   the API calls commonly known as "gethostbyname" and "gethostbyaddress"
> >   reside. These calls are modified to support IPv6 [RFC2553].
> >   A conceptually similar layer exists in authoritative DNS servers,
> >   comprising the parts that generate "meaningful" strings in DNS files.
> >   Due to the popularity of the "master file" format, this layer often
> >   exists only in the administrative routines of the service maintainers.
>
> I'm not at all sure what is meant by "administrative routines of the service
> maintainers"... could you reword or elaborate?
>
> > - The user of this layer (resolver library) is the application programs
> >   that use the DNS, such as mailers, mail servers, Web clients, Web
> >   servers, Web caches, IRC clients, FTP clients, distributed file
> >   systems, distributed databases, and almost all other applications on
> >   TCP/IP.
>
> There are several more layers of interest, and they are the layers
> which are most impacted by IDN changes. As such they really need
> to be considered in your taxonomy.
>
> - the input routines used by applications. often these are supplied by,
>   or part of, the operating system which supports the applications.
>   These layers may or may not support internationalization, or they
>   may support "localization". different instances of the same platform
>   may use different CESs to represent characters.
>
> - the display routines used by applications. similar issues apply as for
>   input methods. however, it cannot be assumed that the input routines
>   and the display routines will change together or that they will support
>   the same formats for representation of IDNs - in general, this will
>   not be the case.
>
> - the means used by applications to exchange DNS names with one another
>   without intervening human input - for instance, drag-and-drop, or
>   text-based cut-and-paste.
>
> - the human users which need to be able to read, transcribe, and
>   input IDNs. arguably their needs include not only reading and
>   typing but also spoken input and audible output.
>
> - the means used by human users to exchange domain names with one another.
>
> >
> > Graphically, one can illustrate it like this:
> >
> > +---------------+                            +---------------------+
> > |  Application  |                            |     (Base data)     |
> > +---------------+                            +---------------------+
> >         |         Application service interface         |
> >         |        For ex. GethostbyXXXX interface        | (no standard)
> > +---------------+                            +---------------------+
> > |   Resolver    |                            |   Auth DNS server   |
> > +---------------+                            +---------------------+
> >         |      <----- DNS service interface ----->      |
> > +------------------------------------------------------------------+
> > |                           DNS service                            |
> > |  +-----------------------+           +--------------------+      |
> > |  | Forwarding DNS server |           | Caching DNS server |      |
> > |  +-----------------------+           +--------------------+      |
> > |                                                                  |
> > |                   +-------------------------+                    |
> > |                   | Parent-zone DNS servers |                    |
> > |                   +-------------------------+                    |
> > |                                                                  |
> > |                   +-------------------------+                    |
> > |                   |    Root DNS servers     |                    |
> > |                   +-------------------------+                    |
> > |                                                                  |
> > +------------------------------------------------------------------+
>
> The picture omits the interaction between applications and users,
> and between multiple applications.
> these could be illustrated in another picture:
>
>                            (spoken / heard)
>      human 1 <------------------------------------------> human 2
>        | ^  \                                            /  ^ |
>        | |   \ (written)                         (read) /   | |
>        | |    ----------------> paper >-----------------    | |
>        | |                                                  | |
>        v |                                                  | v
> +-------------------+      (cut and paste)       +-------------------+
> |  host text input  |<-------------------------->|  host text input  |
> |    and output     |                            |    and output     |
> +-------------------+                            +-------------------+
>        | ^                                                  ^ |
>        v |                      direct                      | v
> +---------------+  (interface between applications)  +---------------+
> | Application 1 |<---------------------------------->| Application 2 |
> +---------------+                                    +---------------+
>
> while it's generally true that most of these interfaces cannot be
> changed or specified by IETF, they are likely to be affected -
> and it is important to consider the effects on these interfaces
> when evaluating a proposal for interoperability. I claim that
> these interfaces are the ones which are most important.
>
> > 1.5 Service model of the DNS
> >
> > The Domain Name Service is used for multiple purposes, each of which is
> > characterized by what it puts into the system (the query) and what it
> > expects as a result (the reply).
> >
> > The most used ones in the current DNS are:
> >
> > - Hostname-to-address service (A, AAAA, A6): Enter a hostname, and get
> >   back an IPv4 or IPv6 address.
> >
> > - Hostname-to-Mail server service (MX): As above, but the expected
> >   return value is a hostname and a priority for SMTP servers.
> >
> > - Address-to-hostname service (PTR): Enter an IPv4 or IPv6 address (in
> >   in-addr.arpa or ip6.int form respectively) and get back a hostname.
> >
> > - Domain delegation service (NS). Enter a domain name and get back
> >   nameserver records (designated hosts who provides authoritive
> >   nameservice) for the domain.
> >
> > New services are being defined, either as entirely new services (IPv6 to
> > hostname mapping using binary labels) or as embellishments to other
> > services (DNSSEC returning information about whether a given DNS service
> > is performed securely or not).
> >
> > These services exist, conceptually, at the Application/Resolver
> > interface, NOT at the DNS-service interface.
>
> I'm not sure what this statement means or what it implies. It is
> not immediately clear to me that the services listed above are
> transparent to lower layers. NATs in particular make assumptions
> about the semantics of address lookups and inverse address lookups.
> DNS servers also treat different kinds of queries 'specially' in
> that they return different 'additional information' depending on
> the query type.
>
> > This document attempts to
> > set requirements for an equivalent of the "used services" given above,
> > where "hostname" is replaced by "Internationalized Domain Name". This
> > doesn't preclude the fact that IDN should work with any kind of DNS
> > queries. IDN is a new service. Since existing protocols like SMTP or
> > HTTP use the old service, it is a matter of great concern how the new
> > and old services work together, and how other protocols can take
> > advantage of the new service.
>
> the last point could use some more elaboration or emphasis.
> perhaps it would help just to put it in a separate paragraph?
>
> > 2.1 Compatibility and Interoperability
> >
> > [1] The DNS is essential to the entire Internet. Therefore, the service
> > MUST NOT damage present DNS protocol interoperability.
>
> good.
>
> > It MUST make the
> > minimum number of changes to existing protocols on all layers of the
> > stack.
>
> I wouldn't state this as a requirement. rather, I would state that
> the requirement is that the IDN system be incrementally deployable
> with minimum disruption to operational services.
> a system that met these requirements would be superior to one which
> required only few changes but which was quite disruptive to deploy or
> which required massive simultaneous deployment. (a 'flag day')
>
> > It MUST continue to allow any system anywhere to resolve any
> > internationalized domain name.
>
> not sure what this means. in general, systems cannot resolve IDNs now,
> since they are not even defined yet. what does it mean for them to
> 'continue' to be able to do so? if the goal is to maintain backward
> compatibility with pre-standard IDN systems, this should be a separate
> goal and it should be stated more clearly and less strongly. of course
> it is highly desirable to make the transition an easy one but this
> concern should not be paramount.
>
> > [2] The service MUST preserve the basic concept and facilities of domain
> > names as described in [RFC1034]. It MUST maintain a single, global,
> > universal, and consistent hierarchical namespace.
>
> yes.
>
> > [2.5] The DNS service layer (the packet formats that go on the wire)
> > MUST NOT limit the codepoints that can be used. This interface SHOULD
> > NOT assign meaning to name strings; the application service layer,
> > where "gethostbyname" et al reside, MAY constrain the name strings to
> > be used in certain services. (conflict)
>
> I don't disagree with the goal, but neither do I see how the desire
> to implement IDNs imposes this as a requirement, or how failure to
> meet this requirement would be disruptive to DNS. This appears to
> over-constrain the solution set.
>
> if one were defining a new service on DNS, it is at least worth
> considering to use a reserved codepoint in a DNS label to indicate
> an IDN label (as opposed to an ASCII label).
>
> > [3] The same name resolution request MUST generate the same response,
> > regardless of the location or localization settings in the resolver, in
> > the master server, and in any slave servers involved in the resolution
> > process.
>
> "MUST generate the same response" modulo error conditions - obviously
> if a slave doesn't have current data and/or is disconnected from the net
> it will not generate the same response as the master server. what you
> don't want is for two different servers to generate conflicting responses.
> (either one can report an error, either one can have stale data as long
> as it's within the TTL, but one server should not "successfully" return
> X while the other "successfully" returns Y).
>
> the other caveat is that this applies to any combination of "new"
> (upgraded to support IDN) and "old" (not upgraded) servers.
>
> > [4] The protocol SHOULD allow creation of caching servers that do
> > not understand the charset in which a request or response is encoded.
> > The caching server SHOULD perform correctly for IDN as well as for
> > current domain names (without the authoritative bit) as the master
> > server would have if presented with the same request.
>
> this presumes that the request or responses will be charset-tagged;
> not necessarily a good idea.
>
> but I would state this more strongly - the protocol should allow
> existing DNS resolvers and caches to do reasonable things with
> IDN RRs. you can expect IDN users to upgrade authoritative servers
> when they start using IDNs but you can't reasonably expect every cache
> to get upgraded to use IDNs before people start using them.
>
> > [5] A caching server MUST NOT return data in response to a query that
> > would not have been returned if the same query had been presented to an
> > authoritative server. This applies fully for the cases when:
> >
> > - The caching server does not know about IDN
> > - The caching server implements the whole specification
> > - The caching server implements a valid subset of the specification
>
> [6] missing?
>
> > [7] The service MAY modify the DNS protocol [RFC1035] and other related
> > work undertaken by the [DNSEXT] WG.
> > However, these changes SHOULD be as small as possible and any changes
> > SHOULD be coordinated with the [DNSEXT] WG.
>
> strongly recommend that any changes should be made by DNSEXT, or by a
> group chartered by IESG for this purpose, not by this group.
> this group can write requirements and perhaps eventually architecture
> specification, not do the protocol design.
>
> > [8] The protocol supporting the service SHOULD be as simple as possible
> > from the user's perspective. Ideally, users SHOULD NOT realize that IDN
> > was added on to the existing DNS.
>
> I don't agree with this as stated. Users will almost certainly have
> to be aware of IDN at some level, if only so that they can realize
> when they can and cannot use an IDN. (they won't be able to use them
> everywhere immediately)
>
> In general, users don't care about the DNS protocol now, and shouldn't
> have to with IDN. Users who maintain master files will have to care
> about it to some degree; it will probably make their lives slightly more
> complicated.
>
> I think this needs rewording or clarification. It's trying
> to take a statement about user needs and make a conclusion about
> DNS protocol complexity. Simple protocols are generally good, but
> to state this as a requirement is not justified.
>
> > [10] The best solution is one that maintains maximum feasible
> > compatibility with current DNS standards as long as it meets the other
> > requirements in this document.
>
> I do not accept the new requirements as more important than maximum
> feasible compatibility. The best solution is one that (a) provides
> effective IDN in the long term and (b) is universally deployable
> (or nearly so) without much disruption in a manner that does not
> fragment DNS space.
>
> > 2.2 Internationalization
> >
> > [11] Internationalized characters MUST be allowed to be represented and
> > used in DNS names and records.
> > The protocol MUST specify what charset is used when resolving domain
> > names and how characters are encoded in DNS records.
>
> this could be taken two ways - certainly you want the CES to be unambiguous.
> but this statement could be read to require that the protocol support
> charset tagging, and I don't think that follows. (nor do I think that
> is what you meant, given the following statement)
>
> > [12] This document RECOMMENDS Unicode only. If multiple charsets are
> > allowed, each charset MUST be tagged and conform to [RFC2277].
> >
> > [12.5] IDN MUST NOT return illegal code points in responses, SHOULD
> > reject queries with illegal codepoints. (one request to add; one request
> > to remove)
>
> what is an illegal codepoint?
> does this mean "illegal codepoint" as defined by unicode?
>
> > [13] CES(s) chosen SHOULD NOT encode ASCII characters differently
> > depending on the other characters in the string. In other words, unless
> > IDN names are identified and coded differently from ASCII-only ones,
> > characters in the ASCII set SHOULD remain as specified in [US-ASCII]
> > (one request to remove).
>
> I don't think this is stated right. I think the requirement is to maintain
> DNS protocol compatibility at all levels for ASCII names and the current
> query and response types. but if the DNS protocol were extended with
> different query types or options or whatever, those extensions could have
> a different representation for ASCII names.
>
> > [14] The protocol SHOULD NOT invent a new CCS for the purpose of IDN
> > only and SHOULD use existing CES. The charset(s) chosen SHOULD also be
> > non-ambiguous.
>
> not clear where these requirements come from - it appears to overconstrain
> the solution space. ideally of course, we wouldn't need to invent a
> new CCS or CES, but there might be advantages in at least doing a different
> CES.
>
> the word 'non-ambiguous' is ambiguous.
> non-ambiguous in what way?
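
The property asked for in requirement [13] above can be checked mechanically
for a candidate CES. The Python sketch below is an illustration only; it is
not part of the draft or of the comments. `ascii_octets_preserved` is a
hypothetical helper name, the sample name is made up, and the codec names are
Python's own. It checks two things for one sample string: that encoding the
string gives the same octets as encoding each character on its own (so no
character's encoding depends on its neighbours), and that every ASCII
character comes out as its single US-ASCII octet.

```python
def ascii_octets_preserved(text: str, codec: str) -> bool:
    """Check one sample string against the property in requirement [13].

    Hypothetical helper for illustration; a real evaluation of a CES
    would need far more than one sample string.
    """
    encoded_chars = [ch.encode(codec) for ch in text]

    # The encoding must be context-free for this sample: encoding the
    # whole string equals concatenating the per-character encodings.
    if b"".join(encoded_chars) != text.encode(codec):
        return False

    # Every ASCII character must be encoded as its single US-ASCII octet.
    return all(
        enc == bytes([ord(ch)])
        for ch, enc in zip(text, encoded_chars)
        if ord(ch) < 0x80
    )


sample = "www.caf\u00e9.example"  # made-up name with one non-ASCII letter
print(ascii_octets_preserved(sample, "utf-8"))      # UTF-8 keeps ASCII octets
print(ascii_octets_preserved(sample, "utf-16-be"))  # UTF-16 widens them
```

A passing result for one string is only evidence, not proof, that a CES meets
[13]; a failing result is enough to show that it does not.
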
>
> > [15] The protocol SHOULD NOT make any assumptions about the location in
> > a domain name where internationalization might appear. In other words,
> > it SHOULD NOT differentiate between any part of a domain name because
> > this MAY impose restrictions on future internationalization efforts.
>
> good. (though it's not appropriate to use capitalized MAY here.)
>
> > [16] The protocol also SHOULD NOT make any localized restrictions in the
> > protocol. For example, an IDN implementation which only allows domain
> > names to use a single local script would immediately restrict
> > multinational organization.
>
> I would state this more broadly - interpretation of an IDN MUST be the
> same from any point on the Internet which supports IDNs.
>
> (of course, this doesn't imply that an Italian keyboard must be able
> to input Korean IDNs...)
>
> > [17] While there are a wide range of devices that use the DNS and a wide
> > range of characteristics of international scripts and methods of
> > domain name input and display, IDN is only concerned with the
> > protocol. Therefore, there MUST be a single way of encoding an
> > internationalized domain name within the DNS.
>
> this says two different things. I agree with the latter statement, but
> I don't agree that IDN can only be concerned with the protocol. This
> misses the real difficulty in implementing IDNs. Now it's true that
> all that IETF can specify is the DNS protocol, but the design must take
> higher layers into consideration if it is to be successful.
>
> > [18] The protocol SHOULD NOT place any restrictions on the
> > application service layer. It SHOULD only specify changes in the DNS
> > service layer and within the DNS itself.
>
> again, this is misleading. IDN is going to have to at least make
> some assumptions about higher levels (which end up being constraints)
> if it is to be successful.
>
> > 2.4 Canonicalization
> >
> > Matching rules are a complicated process for IDN.
> > Canonicalization of characters MUST follow precise and predictable
> > rules to ensure consistency. [CHARREQ] is RECOMMENDED as a guide on
> > canonicalization.
> >
> > The DNS has to match a host name in a request with a host name held
> > in one or more zones. It also needs to sort names into order. It is
> > expected that some sort of canonicalization algorithm will be used as
> > the first step of this process. This section discusses some of the
> > properties which will be REQUIRED of that algorithm.
> >
> > [22] To achieve interoperability, canonicalization MUST be done at a
> > single well-defined place in the DNS resolution process.
>
> not clear. the ultimate solution is likely to employ several kinds of
> canonicalization. e.g.
>
> - conversion of client-specific charset (from input) into Unicode
> - canonicalization of Unicode to get a unique on-the-wire representation
> - locale-specific canonicalization - to meet the requirements of
>   a specific locale within a DNS zone.
>
> these might each need to be done at a different place.
>
> > The protocol
> > MUST specify canonicalization; it MUST specify exactly where in the
> > DNS that canonicalization happens and does not happen; it MUST specify
> > how additions to ISO 10646 will affect the stability of the DNS and
> > the amount of work done on the root DNS servers.
> >
> > [23] The canonicalization algorithm MAY specify operations for case,
> > ligature, and punctuation folding.
>
> and these might be locale-specific.
>
> > [24] In order to retain backwards compatibility with the current DNS,
> > the service MUST retain the case-insensitive comparison for [US-ASCII]
> > as specified in [RFC1035]. For example, Latin capital letter A (U+0041)
> > MUST match Latin small letter a (U+0061). [UTR21] describes some of
> > the issues with case mapping. Case-insensitivity for non [US-ASCII]
> > MUST be discussed in the protocol proposal.
> >
> > [25] Case folding MUST be locale independent.
> > For example, Latin capital letter I (U+0049) case folded to lower case
> > in the Turkish context will become Latin small letter dotless i
> > (U+0131). But in the English context, it will become Latin small
> > letter i (U+0069).
>
> not clear whether you can make this work. given that case folding
> is implemented in current DNS servers - don't see why you cannot
> do locale-dependent case folding in IDN servers. however you want
> to be careful that caches do not apply locale-specific case folding.
>
> > [26] If other canonicalization is done, it MUST be done before the
> > domain name is resolved. Further, the canonicalization MUST be easily
> > upgradable as new languages and writing systems are added.
>
> not clear that this is appropriate - the obvious place to do
> locale-specific canonicalization is on the authoritative servers for
> that zone.
>
> > [27] Any conversion (case, ligature folding, punctuation folding, etc)
> > from what the user enters into a client to what the client asks for
> > resolution MUST be done identically on any request from any client.
>
> .. for a particular zone. not clear that you want to make the same
> constraints apply across all zones. but in general every query should
> get the same result (if successful) from any place on the internet.
> and this applies to a lot more aspects of the system than just
> canonicalization.
>
> also, if client A uses charset X and client B uses charset Y, it's
> hard to impose the constraint that the conversions should be the same -
> there will necessarily be a client-specific conversion between X or
> Y and the charset used by IDN.
>
> > [30] If the charset can be normalized, then it SHOULD be normalized
> > before it is used in IDN. Normalization SHOULD follow [UTR15].
> > (conflict)
> >
> > [31] The protocol SHOULD avoid inventing a new normalization form
> > provided a technically sufficient one is available.
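
To make the interaction between [24], [25], and [30] above more concrete,
here is a minimal Python sketch of one possible locale-independent matching
rule. It is an assumption-laden illustration, not something the draft or the
comments specify: the names `fold_ascii_case`, `canonical_label`, and
`labels_match` are hypothetical, and the choice of NFC as the [UTR15]
normalization form is mine (the draft does not pick one). Only A-Z is case
folded, so the Turkish/English difference for the letter I never comes up;
non-ASCII case is deliberately left alone.

```python
import unicodedata


def fold_ascii_case(label: str) -> str:
    """Lower-case only A-Z; every other character is left untouched."""
    return "".join(
        chr(ord(ch) + 0x20) if "A" <= ch <= "Z" else ch
        for ch in label
    )


def canonical_label(label: str) -> str:
    """Hypothetical canonical form: NFC first, then ASCII-only case folding.

    Normalizing first matters because some non-ASCII characters normalize
    to ASCII letters, which then still need folding.
    """
    return fold_ascii_case(unicodedata.normalize("NFC", label))


def labels_match(a: str, b: str) -> bool:
    """Compare two labels under the sketch's canonical form."""
    return canonical_label(a) == canonical_label(b)


print(labels_match("EXAMPLE", "example"))    # ASCII case is ignored
print(labels_match("I", "\u0131"))           # I vs. dotless i: no locale rule
print(labels_match("e\u0301", "\u00e9"))     # decomposed vs. precomposed e-acute
```

Under this sketch any locale-specific equivalence (such as treating I and
dotless i as the same letter) would have to be added somewhere else, for
example by the zone, which is the question raised in the comments on [25]
and [26].
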
> >
> > 2.5 Operational Issues
> >
> > [32] Zone files SHOULD remain easily editable.
>
> by what kinds of tools? do these tools need to use the same CCS
> and CES as IDN uses?
>
> > [33] An IDN-capable resolver or server SHALL NOT generate more traffic
> > than a non-IDN-capable resolver or server would when resolving an
> > ASCII-only domain name. The amount of traffic generated when resolving
> > an IDN SHALL be similar to that generated when resolving an ASCII-only
> > name.
>
> SHALL NOT seems too strong; a modest increase in traffic (due to
> larger storage requirements for labels, for instance) should
> be acceptable. the latter statement is better, and seems sufficient
> by itself.
>
> > [34] The service SHOULD NOT add new centralized administration for the
> > DNS. A domain administrator SHOULD be able to create internationalized
> > names as easily as adding current domain names.
>
> mumble. with similar ease or difficulty, not "as easily".
>
> > [35] Within a single zone, the zone manager MUST be able to define
> > equivalence rules that suit the purpose of the zone, such as, but not
> > limited to, and not necessarily, non-ASCII case folding, Unicode
> > normalizations (if Unicode is chosen), Cyrillic/Greek/Latin folding, or
> > traditional/simplified Chinese equivalence. Such defined equivalences
> > MUST NOT remove equivalences that are assumed by (old or
> > local-rule-ignorant) caches.
>
> you need to distinguish zone-imposed equivalence rules from protocol-imposed
> equivalence rules. they will probably get handled differently.
>
> caches that, for whatever reason, assume equivalence rules other than
> those imposed by the protocol, will probably break things.
> to expect them not to do so is not a reasonable requirement as stated.
>
> -Keith
>
> ------- end of forwarded message -------
>
> ------- End of Forwarded Message
