Here's what I know:

According to RFC 977, newsgroup names are supposed to be in ASCII.

Some news servers are set up with non-ASCII newsgroup names anyway.  (Note: 
Netscape's old news server did not allow you to do that.)

To fix bugs relating to this, I have used external news servers which 
were set up with non-ASCII newsgroup names.

Whatever the news server sends us, we treat as Latin-1.

We do that because we have no way of knowing what the news server's system 
charset is.
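
To illustrate why Latin-1 is a safe fallback when the charset is unknown, here is a rough sketch in Python. It is not the client's actual code; the function names and the sample group name are made up for illustration. The point is that Latin-1 maps every byte value to exactly one character, so decoding never fails and the original bytes can always be recovered, even if the displayed text is wrong.

```python
# Illustrative sketch only: the helper names and the sample group name
# are hypothetical, not taken from any real news client.

def decode_group_name(raw: bytes) -> str:
    """Decode newsgroup-name bytes from the server as Latin-1.

    Latin-1 maps each byte 0x00-0xFF to the code point with the same
    value, so this never raises, whatever charset the server used.
    """
    return raw.decode("latin-1")


def encode_group_name(name: str) -> bytes:
    """Turn a name produced by decode_group_name back into wire bytes."""
    return name.encode("latin-1")


# A name that really is Latin-1 displays correctly.
raw = b"fr.r\xe9seau.test"
shown = decode_group_name(raw)

# A UTF-8 name displays as mojibake, but the bytes still round-trip,
# so the same name can be sent back to the server unchanged.
utf8_raw = "fr.r\u00e9seau.test".encode("utf-8")
garbled = decode_group_name(utf8_raw)
assert encode_group_name(garbled) == utf8_raw
```

The trade-off is that names in any other charset will look wrong to the user, but nothing is lost on the wire.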

4.x does the same thing, as do other news clients (like sfraser's 
MT-NewsWatcher).  I'm not sure what Outlook does.

see 
http://bugzilla.mozilla.org/buglist.cgi?bug_status=RESOLVED&bug_status=VERIFIED&email1=sspitzer&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&short_desc=non+ASCII&short_desc_type=allwordssubstr&long_desc=&long_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_type=allwordssubstr&status_whiteboard=&status_whiteboard_type=allwordssubstr&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&newqueryname=&order=Reuse+same+sort+as+last+time

for some related bugs.

-Seth

Dan Mosedale wrote:

> [EMAIL PROTECTED] (Yung-Fong Tang) writes:
> 
>>According to the original NNTP protocol, RFC 977, it is NOT allowed.
>>However, who follows only the original RFC anyway?
>>
> 
> Indeed; RFC 977 is quite dated.  I do know that in Europe and Asia
> there are plenty of non-ASCII newsgroup names in use.  I don't know
> what encoding(s) they use, but given the age of some of these
> hierarchies, I suspect it's not just UTF8.  Added the .mail-news group
> to this post, as I bet some of the folks there know more.  Anyone?
> 
> 
>>http://info.internet.isi.edu/in-notes/rfc/files/rfc977.txt :
>>
>>2.2.  Character Codes
>>
>>   Commands and replies are composed of characters from the ASCII
>>   character set.  When the transport service provides an 8-bit byte
>>   (octet) transmission channel, each 7-bit character is transmitted
>>   right justified in an octet with the high order bit cleared to zero.
>>
>>RFC 2980, "Common NNTP Extensions" ( http://www.ietf.org/rfc/rfc2980.txt ),
>>does not improve on it.
>>However, look at the following. The up-to-date Internet Draft for NNTP
>>http://www.ietf.org/internet-drafts/draft-ietf-nntpext-base-13.txt
>>says the following:
>>
>>3. Introduction
>>
>>.....
>>            Every attempt is made to insure that the protocol
>>            specification in this document is compatible with the version
>>            specified in RFC 977[1]. However, this version does not
>>            support the ill-defined SLAVE command and permits four digit
>>            years to be specified in the NEWNEWS and NEWGROUPS commands.
>>            It changes the default character set to UTF-8[2] instead of
>>            US-ASCII[3]. It also extends the newsgroup name matching
>>            capabilities already documented in RFC 977.
>>.....
>> 4. Basic Operation.
>>....
>> The character set for all NNTP commands is UTF-8.
>>.....
>> 5. The WILDMAT format
>>
>>            The WILDMAT format[5] described here is based on the version
>>            first developed by Rich Salz which was derived from the format
>>            used in the UNIX "find" command to articulate file names. It
>>            was developed to provide a uniform mechanism for matching
>>            patterns in the same manner that the UNIX shell matches
>>            filenames. Patterns are implicitly anchored at the beginning
>>            and end of each string when testing for a match.  There are
>>            five pattern-matching operations other than a strict one-to-
>>            one match between the pattern and the source to be checked for
>>            a match. The first is an asterisk (*) to match any sequence of
>>            zero or more UTF-8 characters. The second is a question mark
>>            (?) to match any single UTF-8 character.
>>
>>.....
>>            Implementers must be careful to apply the pattern-matching
>>            operators to whole characters encoded in UTF-8, and not to
>>            individual octets.
>>.....
>>
>>Naoki Hotta wrote:
>>
>>
>>>I think that's no but I am not sure.
>>>Cc to Momoi san and Frank.
>>>
>>>Naoki
>>>
>>>Xianglan Shirley Ji wrote:
>>>
>>>
>>>>Hi Naoki,
>>>>
>>>>Can newsgroup name be non-ASCII?
>>>>If yes, do you know if there is any news server holding this kind of
>>>>newsgroup?
>>>>
>>>>Thanks,
>>>>Xianglan
>>>>
>>>>

