Kent, you are coming in very late in this process. Much of this was debated back and forth early on, both by mail and in person. As a start, I suggest that you review all of the archives so that you don't simply retread old issues.

Mark
—————
Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο πάντα — Ὁμήρου Μαργίτῃ
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]
http://www.macchiato.com

----- Original Message -----
From: "Kent Karlsson" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, February 07, 2002 15:25
Subject: [idn] Comments on IDNA/stringprep/nameprep

Comments on IDNA/stringprep/nameprep

1. stringprep and nameprep should be rejoined into a single hostnameprep. They are only about host name preparation, not any other name preparation. Similar preparations may still take advantage of the hostnameprep document by declaring "deltas", small changes that may be needed for other (Internet, DNS) names. That would likely minimise the size of the "reuse" documents.

2. hostnameprep should be applied to the *entire* hostname; i.e. the entire name should be 'mapped' in the same way *before* it is parsed into parts.

3. Various FULL STOPs should be mapped to FULL STOP, which must be allowed. Some of this is accomplished via NFKC, but some mappings need to be added specially for hostnameprep, e.g. for IDEOGRAPHIC FULL STOP. (Note that parsing into parts should come after the mapping and prohibition steps.)

4. Various Pd (punctuation dash) characters should be mapped to HYPHEN-MINUS by hostnameprep. Future keyboards may generate HYPHEN rather than HYPHEN-MINUS (except perhaps in "programming language mode", which few will use). At the very least, hostnameprep should not prevent such a development.

5. Symbols/punctuation/dingbats (except the hyphen-like dashes) should not be allowed by hostnameprep, and all of that prohibition should be in hostnameprep, rather than having some of it handled differently elsewhere. Punctuation in particular matters here: in contexts where hostnames are embedded, future syntaxes may use non-ASCII punctuation adjacent to the hostname. At the very least, such a development should not be prevented by hostnameprep. Symbols are at present excluded, and should remain so also for non-ASCII symbols, for the same reason that punctuation should be excluded.

6. Hangul syllables (with conjoining characters, not non-conjoining compatibility characters) that represent the same syllable must be mapped to the same representation. For unfortunate historical reasons, this no longer happens automatically with NFKC (though in drafts of NFKC it did). Mappings should be added so that "syllabically" equivalent Hangul conjoining characters are mapped to a common representation. Hangul compatibility letters, however, should be prohibited: correctly mapping those is more complicated than can be expressed in the (current form of) (host)nameprep mappings. (A mapping table for Jamos, and a prohibition table for Hangul compatibility characters, are available upon request.) Future keyboard input, for example, may generate only single-letter Jamos, rather than any "cluster letter" Jamos or precomposed Hangul syllable characters. At the very least, hostnameprep should not prevent such a development.

7. No document associated with hostnameprep should place any further restrictions on domain/host names than hostnameprep itself does. (In addition, duplicating some of the restrictions elsewhere is confusing and should not be done.)

8. Note: The SC/TC issue cannot be solved at a near-impossible-to-change (once deployed) technical level, but should instead be solved at a policy level (which may employ software with relatively easy-to-change mappings).

9. It should be recommended that user interfaces that encounter mixed-script hostname *parts* "flag" them (balloon warning, colour differentiation, blinking, bouncing automatic registrations, ...).

/Kent Karlsson
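
As an illustration of the ordering argued for in points 2 through 5 (map the whole name, then prohibit, then split into parts), here is a minimal Python sketch. It is not taken from any draft: the function names `hostnameprep_sketch` and `same_after_nfkc` and the one-entry table `EXTRA_FULL_STOPS` are hypothetical, and the prohibition rule is simply "any remaining punctuation or symbol category", which is an assumption about how point 5 might be read. The second helper illustrates the Hangul concern in point 6 by comparing two Jamo spellings after NFKC.

```python
import unicodedata

# Assumption: after NFKC, IDEOGRAPHIC FULL STOP (U+3002) still needs an
# explicit mapping. A real table would likely have more entries.
EXTRA_FULL_STOPS = {"\u3002"}


def hostnameprep_sketch(name: str) -> list[str]:
    """Map and check the entire name, and only then split it into parts."""
    # Step 1: NFKC folds compatibility forms, e.g. FULLWIDTH FULL STOP -> ".".
    text = unicodedata.normalize("NFKC", name)

    # Step 2: extra mappings applied to the whole name (points 3 and 4).
    mapped = []
    for ch in text:
        if ch in EXTRA_FULL_STOPS:
            mapped.append(".")
        elif unicodedata.category(ch) == "Pd":
            mapped.append("-")  # any dash punctuation -> HYPHEN-MINUS
        else:
            mapped.append(ch)
    text = "".join(mapped)

    # Step 3: prohibit remaining punctuation (P*) and symbols (S*) (point 5).
    for ch in text:
        if ch in ".-":
            continue
        if unicodedata.category(ch)[0] in "PS":
            raise ValueError(f"prohibited character U+{ord(ch):04X}")

    # Step 4: parse into parts only after mapping and prohibition (point 2).
    return text.split(".")


def same_after_nfkc(a: str, b: str) -> bool:
    """Report whether two strings are identical after NFKC normalization."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


if __name__ == "__main__":
    # IDEOGRAPHIC FULL STOP, HYPHEN and FULLWIDTH FULL STOP in one name.
    print(hostnameprep_sketch("www\u3002example\u2010site\uFF0Ecom"))
    # A "cluster letter" Jamo (U+1101) versus two single-letter Jamos (U+1100).
    print(same_after_nfkc("\u1101\u1161", "\u1100\u1100\u1161"))
```

The second call is there to show that NFKC alone does not treat the cluster Jamo and the doubled single-letter Jamo as the same spelling, which is the gap that the additional Jamo mappings in point 6 would have to close.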
