"Adam M. Costello" wrote:
>
> Yves Arrouye <[EMAIL PROTECTED]> wrote:
> > why bother with the 0000-0020 and 007F
>
> I've wondered about that myself. The nameprep draft gives the reasons
> for all the prohibited code points. I'm not sure how convincing all the
> reasons are, but you can look at them and judge for yourself.

I haven't yet had a chance to go through the latest drafts (other
deadlines) so this is coming from a slightly ignorant perspective, but I
would ask that the IDNA team re-evaluate their position on this matter.

First of all, those code-points should not be universally prohibited,
since they are absolutely legitimate in STD13 domain names. ANY octet
value is allowed.

If you are going to be moving SOME of the prohibited characters from
nameprep to the IDNA hostname processing, then you need to move ALL of
them at that stage.

However, this is not the approach I would recommend. Instead, I would
recommend that a series of nameprep-like documents be prepared under the
stringprep umbrella, with one document for each of the known data-types.
For example, there should be a document for i18n hostnames, there should
be another document for STD13 hostnames (to act as an IDNA output
filter), another document for current RFC 2822 email addresses, and so
forth. Each of these documents can then be used as data-type inputs.

Or to put it in another perspective, each of them can be treated
somewhat like Unicode "property" tables, with certain types of output
only being allowed to contain characters from the relevant property.

I'm sure that somebody will say something about this being too late, but
there you have it. At the very least, you should consolidate the
prohibited characters into IDNA, as the prohibited characters which
appear to be in nameprep are in fact valid for STD13 domain names.

I will try to read the last-call drafts and make more informed
commentary tomorrow or Friday.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/
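
The per-data-type idea above, with each profile acting like a "property"
table, could be sketched roughly as follows. This is only an illustration:
the profile names, the `PROFILES` table and the `rejected` helper are all
hypothetical, and the character sets are deliberately simplified rather
than taken from any stringprep or nameprep draft.

```python
import string
from typing import Callable, Dict, List

# A profile is a predicate over a single code point (assumed shape).
Profile = Callable[[int], bool]

# Simplified letter/digit/hyphen set, standing in for a hostname table.
_LDH = set(string.ascii_letters + string.digits + "-")


def std13_domain_name(cp: int) -> bool:
    # Per the message above: any octet value is allowed in an STD13 name.
    return 0x00 <= cp <= 0xFF


def ldh_hostname(cp: int) -> bool:
    # Illustrative hostname profile: ASCII letters, digits and hyphen only.
    return cp < 0x80 and chr(cp) in _LDH


# Hypothetical registry: one table per data type.
PROFILES: Dict[str, Profile] = {
    "std13-domain-name": std13_domain_name,
    "ldh-hostname": ldh_hostname,
}


def rejected(profile_name: str, text: str) -> List[int]:
    """Return the code points in text that the named profile disallows."""
    allowed = PROFILES[profile_name]
    return [ord(ch) for ch in text if not allowed(ord(ch))]
```

Under these assumptions, a string containing a space (0020) and DEL (007F)
passes the domain-name profile but is rejected by the hostname profile,
which is the distinction the message argues should not be collapsed into
a single universal prohibition.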
