Markus Scherer wrote:
[EMAIL PROTECTED] wrote:

We are talking about charset values for Internet protocols here. That is a special, narrow subset of charset names. The values used by Internet protocols are defined by a well-defined process: http://www.faqs.org/rfcs/rfc2278.html RFC 2278 - IANA Charset Registration Procedures

"well defined process" is a stretch. By the way, RFC 2978 replaced 2278 a few years ago.

I agree with Markus on all his points. It's regrettable but true that all sorts of garbage (in addition to well-defined, useful character encodings) were thrown into the registry and accepted almost blindly. In other words, there's a serious quality control issue.


The problem with the IANA charset _list_ is that it lists not only useful charset names but also
- names that are illegal for charsets by its own rules
- names for things that are not (verifiably) charsets at all

Markus may have something different in mind, but I'd add to this category coded character set names like ks_c_5601-1987 that are not suitable for use as a MIME charset.


- names for charsets that cannot be implemented reliably
  because there is no online, machine-readable specification for them,
  and even for the ones where there is one, it is not usually
  a mapping to/from Unicode

Related to the last point is that some charsets are not properly annotated.
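
To make the first point in the list above concrete, here is a minimal sketch of a syntax check for charset names. It assumes the mime-charset grammar as I read it in RFC 2978 (section 2.3) and the 40-character limit stated in the registry; the function name is made up for illustration. Note that it tests syntax only: a name like ks_c_5601-1987 passes, because the objection to it is about what it names, not how it is spelled.

```python
import re

# Assumption: allowed characters per the mime-charset rule in RFC 2978, sec. 2.3:
# ALPHA / DIGIT / ! # $ % & ' + - ^ _ ` { } ~
_MIME_CHARSET = re.compile(r"[A-Za-z0-9!#$%&'+\-^_`{}~]+")

# Assumption: the registry limits names to 40 characters.
_MAX_LEN = 40


def is_legal_charset_name(name: str) -> bool:
    """Return True if `name` is non-empty, short enough, and uses only allowed characters."""
    return len(name) <= _MAX_LEN and _MIME_CHARSET.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("UTF-8", "ks_c_5601-1987", "ISO_8859-1:1987"):
        print(candidate, is_legal_charset_name(candidate))
```

A registered name such as ISO_8859-1:1987 fails this check because of the colon, which is the kind of entry the first bullet seems to describe.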


Jungshik



