Moreover, if "utf-8" is to be a charset, it should define a specific set of characters and carry a versioning tag.
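
As a rough illustration of why the name "utf-8" alone does not pin down a set of characters, here is a small Python sketch. It is only a sketch under stated assumptions: the helper names utf8_encodable and assigned_in_runtime_ucd are made up for this example, and the "assigned" test depends on whichever Unicode database version the running interpreter ships, so it is not a statement about any particular UCS snapshot.

```python
import unicodedata


def utf8_encodable(cp: int) -> bool:
    """Return True if code point cp can be serialized as UTF-8.

    Hypothetical helper: UTF-8 accepts every scalar value (everything
    except the surrogate range), whether or not a character is assigned.
    """
    try:
        chr(cp).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


def assigned_in_runtime_ucd(cp: int) -> bool:
    """Return True if this interpreter's Unicode database assigns cp.

    Hypothetical helper: general category "Cn" means "not assigned" in the
    database version reported by unicodedata.unidata_version.
    """
    return unicodedata.category(chr(cp)) != "Cn"


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    # U+0378 is a reserved, unassigned code point at the time of writing;
    # U+1F600 was added long after UCS 3.2.
    for cp in (0x0041, 0x0378, 0x1F600):
        print(f"U+{cp:04X}",
              "encodable:", utf8_encodable(cp),
              "assigned here:", assigned_in_runtime_ucd(cp))
```

The encoding step succeeds for all three code points, while the "assigned" answer is a property of the repertoire version, not of utf-8 itself.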
I suggest: "utf-8" is an encoding form for UCS. "utf-8-3.1" and "utf-8-3.2" are charset names, each a snapshot of UCS with the specified encoding form utf-8. This taxonomy would make the situation clear. Do not forget that the set of characters of the vague "utf-8" is not fixed by its definition.

Soobok Lee

----- Original Message -----
From: "Soobok Lee" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>; "Paul Hoffman / IMC" <[EMAIL PROTECTED]>
Sent: Monday, June 03, 2002 11:03 AM
Subject: Re: [idn] utf8/legacy versioning

> Thanks for your correction. UTF-8 (not utf8) is in that list.
>
> But UTF-8 is a character encoding form of UCS and does not specify the
> specific set of supported characters in the numerous versions of UCS.
> That is the point where utf-8 and iso8859-1/ksc_5601_1987 differ.
> I.e., UTF-8 defines an encoding scheme over UCS, which has been an open
> set and will remain an open set.
>
> UTF-8, from its definition, cannot have versioning suffixes, like
> "utf-8-3.2" or "utf-8-3.1".
> That's why "utf-8" should not be regarded as a "genuine" charset, IMO.
>
> Correct me if I am wrong at some points. Thanks.
>
> Soobok Lee
>
> ----- Original Message -----
> From: "Paul Hoffman / IMC" <[EMAIL PROTECTED]>
> To: "Soobok Lee" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> Sent: Monday, June 03, 2002 10:36 AM
> Subject: Re: [idn] utf8/legacy versioning
>
> > At 10:11 AM +0900 6/3/02, Soobok Lee wrote:
> > >Moreover, it does not have a "utf8" charset entry, because "utf8" is
> > >just one of the encodings of the Universal
> > >Character Set, not an independent charset plus encoding like "ks_c_5601-1987".
> >
> > Both your statement and your reasoning are wrong. UTF-8 has been a
> > registered charset since RFC 2279 was issued.
> >
> > --Paul Hoffman, Director
> > --Internet Mail Consortium
>
