"Adam M. Costello" wrote: > > "Eric A. Hall" <[EMAIL PROTECTED]> wrote: > > > > For a given domain name slot (protocol element, structured data > > > field, function argument, etc), the governing specification...does > > > not dictate what you may do with the name after you've read it. > > > > True for message fields, false for the loosely-coupled data-types > > which are independent of a particular of any particular protocol > > message (email addresses, Message-ID, URLs, etc). > > I'm not getting your point at all. Apparently you are concerned that > there exist scenarios where ToUnicode ought to be forbidden/discouraged, > but the IDNA draft allows/encourages it. Maybe if you described a > specific example of such a scenario, I'd start to understand.
I don't know how to say this any differently than I have been saying
it. The ToUnicode step changes well-known and widely-used data-types in
such a way that the data no longer conforms to the rules which govern
that data. These changes are harmful when the extended data-types are
reused, whether by copy-n-paste, program output, direct transcription,
or whatever.

An example I have already given is Message-ID. Basic functionality will
be broken if the structure of the well-known and widely-used Message-ID
data-type as defined in STD11 is extended beyond the scope of the
governing spec. The extended, STD11-incompatible form will break on
search inputs that don't allow those values, it will break if the search
input accepts it and passes it to a remote system via an IMAP SEARCH or
NNTP XPAT operation, it will break if a user puts it into a news or http
URL which gets transliterated and then percent-hacked, and it will break
if a user types it into a web URL as a parameter to a server-based
search function.

That's just search and fetch, never mind other problems like damage to
threading that results from an extended Message-ID which is manually
added to See-Also or References, corrupted spam complaints, and the
dozens of other common uses for this well-known, widely-used,
STANDARDIZED data-type.

This does not mean that users with mail systems located in IDN zones
cannot generate Message-IDs from those domains, only that they have to
continue generating them with LDH domains (via ToASCII). In that
scenario, Message-ID would continue to function across all of the
services which utilize the format specified in STD11. That is the way it
MUST be until that data-type is either extended (after months of
rancorous debate, surely), or it is replaced with something else, or the
WG in charge of that standard chooses to allow for transliteration
(THEM, NOT US).
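To illustrate the ToASCII scenario described above, here is a minimal
sketch in Python. It assumes Python's built-in "idna" codec (which
implements the RFC 3490 ToASCII operation) is an acceptable stand-in for
the draft's ToASCII step. The helper names make_message_id and
is_ldh_message_id, the example domain, and the simplified pattern are
hypothetical and are not taken from STD11 or from the IDNA draft.

```python
import re
import uuid
from typing import Optional

# Simplified, illustrative check (NOT the full STD11 grammar):
# an ASCII local part, "@", and a domain made only of LDH labels.
LDH_MSGID = re.compile(
    r"<[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*>"
)


def make_message_id(domain: str, unique: Optional[str] = None) -> str:
    """Hypothetical helper: build a Message-ID whose domain is LDH-only."""
    # ToASCII: convert any internationalized labels to their ACE form.
    ascii_domain = domain.encode("idna").decode("ascii")
    # Assumption: a random hex token is unique enough for this sketch.
    local = unique if unique is not None else uuid.uuid4().hex
    return f"<{local}@{ascii_domain}>"


def is_ldh_message_id(msgid: str) -> bool:
    """Return True if msgid fits the simplified ASCII/LDH pattern above."""
    return LDH_MSGID.fullmatch(msgid) is not None


if __name__ == "__main__":
    msgid = make_message_id("bücher.example", "abc123")
    print(msgid)                                         # ACE (ToASCII) form
    print(is_ldh_message_id(msgid))                      # passes the check
    print(is_ldh_message_id("<abc123@bücher.example>"))  # ToUnicode form fails
```

The point of the sketch is only that the ToASCII form passes an
ASCII-only check of the kind a search input or remote server might
apply, while the ToUnicode form of the same identifier does not.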
Note well: the subsequent rancorous debates are not the problems of this
WG, any more than extending the data-types is within our scope of
responsibility. They are smart enough to figure out what they can get
away with on a case-by-case basis.

Chorus: Implicitly extending all such data-types currently in use on the
Internet (as the current draft does) is willfully causing breakage, and
goes well beyond the authority of this WG.

-- 
Eric A. Hall                http://www.ehsco.com/
Internet Core Protocols     http://www.oreilly.com/catalog/coreprot/
