I did not mention Arabic vowels and Shadda because I don't feel
qualified to.

Jony

> -----Original Message-----
> From: Paul Hoffman / IMC [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, July 08, 2000 8:06 PM
> To: Jonathan Rosenne; [EMAIL PROTECTED]
> Subject: RE: [idn] Preparation of Internationalized Host Names - Hebrew
>
> At 12:43 PM +0300 7/8/00, Jonathan Rosenne wrote:
> >> Please note that not all punctuation is prohibited. The rules for the
> >> specific kinds of punctuation that are prohibited are in the document.
> >>
> >> U+05C0, which looks just like the ASCII "vertical bar", is probably
> >> acceptable (since vertical bar is acceptable). U+05C3 looks just like
> >> a colon and is therefore not acceptable; thanks for pointing this
> >> out. (And I have noted it to the Unicode folks for when they update
> >> the standard).
> >
> >Its meaning is punctuation, like comma or full stop, never mind its
> >shape.
>
> Exactly my point. At present, we do *not* prohibit all punctuation.
> The only prohibited punctuation characters are those that are reserved
> or delimiters in URLs [RFC2396] and [RFC2732]. If this group decides
> to prohibit all punctuation, certainly we would then prohibit U+05C0.
> Or, we might prohibit all punctuation other than a certain small
> group of characters (which would be pretty difficult to choose
> correctly...). But, for now, we only prohibit a small set.
>
> >> >2. Cantillation Marks
> >> >0591 to 05af
> >> >
> >> >These should be either prohibited or ignored since they do not
> >> >affect pronunciation, similar to ignoring case differences.
> >> >
> >> >Personally, I would rather prohibit them since their presence is
> >> >most likely to be an error.
> >>
> >> If they never appear in personal names, company names, or spoken
> >> phrases, then they can safely be prohibited. Is that true for all of
> >> them?
> >
> >They never appear in common use; they are only used in biblical texts.
>
> Thanks, that's what I wanted to hear. I'll prohibit them in the
> next draft.
>
> >> >2. Points
> >> >05b0 to 05c4
> >> >
> >> >These should be either prohibited or ignored since they are
> >> >optional. In modern Hebrew they are seldom used, not all systems
> >> >support them, and it is valid to omit them.
> >> >
> >> >Personally, I would rather ignore them because a user may enter
> >> >them and why not let him.
> >>
> >> This is much more problematic. We do not currently have any "ignored"
> >> characters. If I understand this correctly, the host name <HEBREW
> >> LETTER HE><HEBREW POINT SEGOL>.com looks and sounds different than
> >> <HEBREW LETTER HE><HEBREW POINT TSERE>.com, but could be considered
> >> the same for a host name. If so, I think we would have to prohibit
> >> them, not ignore them. Does that sound correct?
> >
> >They do sound different, but do not necessarily look different because
> >it is not mandatory to display points.
> >
> >Just like you ignore case in English, in Hebrew you should ignore points.
>
> From my (very limited) understanding of Hebrew, this makes sense.
> However, it means that we will have to make such other "ignoring"
> rules for a variety of scripts. I'm happy to do that if the group
> wants, but it certainly makes the name preparation harder. (Just to
> be clear: my personal preference would have been not to ignore case,
> but that decision was made *long* ago and cannot be reversed.) Doing
> so would require an extra step, probably between checking for
> prohibited characters and folding case, that says "look for any
> characters on this list and throw them away".
>
> How does the group feel about this? What other characters in scripts
> other than Hebrew would go here?
>
> --Paul Hoffman, Director
> --Internet Mail Consortium
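
To make the proposed ordering concrete, here is a rough Python sketch of the three steps Paul describes: check for prohibited characters, throw away "ignored" characters, then fold case. It is only an illustration under stated assumptions, not text from any draft. The function name prepare_label and the constant names are hypothetical. The ranges are the ones quoted in this thread. The sketch also assumes that only the combining marks inside 05b0 to 05c4 count as points, so that punctuation code points falling in that range (such as U+05C0 and U+05C3) are left to the prohibited-character check.

```python
import unicodedata

# Hypothetical tables built from the ranges quoted in this thread.
CANTILLATION = range(0x0591, 0x05B0)   # U+0591..U+05AF, to be prohibited
POINTS_RANGE = range(0x05B0, 0x05C5)   # U+05B0..U+05C4, candidates to ignore
PROHIBITED_PUNCTUATION = {0x05C3}      # looks like a colon (a URL delimiter)


def is_ignored(ch: str) -> bool:
    """Assumption: only nonspacing marks in the points range are ignored."""
    return ord(ch) in POINTS_RANGE and unicodedata.category(ch) == "Mn"


def prepare_label(label: str) -> str:
    # Step 1: reject prohibited characters outright.
    for ch in label:
        cp = ord(ch)
        if cp in CANTILLATION or cp in PROHIBITED_PUNCTUATION:
            raise ValueError(f"prohibited character U+{cp:04X}")
    # Step 2 (the proposed extra step): throw away ignored characters.
    kept = "".join(ch for ch in label if not is_ignored(ch))
    # Step 3: fold case, as is already done today.
    return kept.casefold()
```

With these assumptions, <HEBREW LETTER HE><HEBREW POINT SEGOL> and <HEBREW LETTER HE><HEBREW POINT TSERE> both prepare to a bare HE, which is the "same host name" behaviour discussed above, while a label containing a cantillation mark is rejected rather than silently changed.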

