This message contains responses to both James Seng and Soobok Lee.

James Seng <[EMAIL PROTECTED]> wrote:

> My believe is what is allowed in host labels is a topic for the zone
> administrator to decide. .CN have a different set compared to .SG
> compared to .COM compared to say IBM.COM.

Zone administrators can always impose their own restrictions, but that
still leaves us with the question of what the IRI spec should say about
what characters are allowed in the host field of IRIs.

The historic precedent is that ASCII punctuation and symbols are allowed
in ASCII *domain* names, but not in ASCII *host* names, and not in the
host field of URIs.

Should IRIs be looser and allow non-ASCII punctuation and symbols
in the host field (while continuing to disallow ASCII punctuation
and symbols)?  Or should IRIs apply an old tradition to a new
situation and disallow punctuation and symbols altogether?

Soobok Lee <[EMAIL PROTECTED]> wrote:

> > L: letter
> > M: mark
> > N: number
> > P: punctuation
> > S: symbol
> > Z: separator
> > C: other
> 
> May I add this?
> 
>   U: unassigned code points.

I see your motivation.  The classes I listed are all the ones mentioned
in the Unicode character database, but of course the database covers
only assigned code points.  All code points not mentioned in the
database are unassigned, and we could view that as another class.
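
To make the classes concrete, here is a rough Python sketch that maps a
character to its one-letter major class, with U for unassigned.  The
function name major_class is mine, not from any spec.  It relies on the
fact that Python's unicodedata module reports code points missing from
the database as "Cn"; note that the module follows whatever Unicode
version ships with the interpreter, so the U set may differ from the
one a given IDN implementation sees.

```python
import unicodedata


def major_class(ch: str) -> str:
    """Return the one-letter major class of ch, or 'U' if unassigned.

    Hypothetical helper for illustration only.
    """
    cat = unicodedata.category(ch)  # two-letter category, e.g. "Ll", "Pd"
    # Code points absent from the database come back as "Cn"; treat them
    # as the separate class U suggested above.
    if cat == "Cn":
        return "U"
    return cat[0]


def classes_in_label(label: str) -> set:
    """Collect the major classes that occur in a host label."""
    return {major_class(ch) for ch in label}
```

A label such as "example1" contains only classes L and N, so it would
pass under either answer to the punctuation question above.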

>  U should be also allowed in addition to L,M,N.

ToASCII and Nameprep already take an input flag indicating whether
unassigned code points are to be allowed or prohibited.  My proposal
wouldn't change that.
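
As a rough illustration of that flag (not the actual ToASCII or
Nameprep code), the sketch below uses Python's stringprep module, whose
in_table_a1() tests membership in the RFC 3454 table of code points
unassigned in Unicode 3.2.  The function name and the allow_unassigned
parameter are my own, chosen to mirror the flag; all other Nameprep
steps are omitted.

```python
import stringprep


def passes_unassigned_check(label: str, allow_unassigned: bool) -> bool:
    """Return True if label passes the unassigned-code-point check.

    Hypothetical helper: models only the flag, not mapping,
    normalization, or the other prohibition tables.
    """
    if allow_unassigned:
        # Unassigned code points are tolerated (e.g. for queries).
        return True
    # Otherwise reject any code point listed in RFC 3454 table A.1.
    return not any(stringprep.in_table_a1(ch) for ch in label)
```

For example, U+0221 was unassigned in Unicode 3.2, so a label that
contains it is rejected when the flag is off and accepted when it is on.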

AMC
