Hi Ambrose, and thanks for the interesting suggestion.

 On Thursday, December 7, 2006 at 11:03:01 -0500, Ambrose Li wrote:

> In practice, GB-based Chinese emails with traditional characters are
> tagged as "GB2312" even though technically speaking they are in
> GB18030.

    So you can work around this inaccurate MIME label by aliasing it:

| charset-hook ^gb2312$ gb18030
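
    To see why the alias helps, here is a rough Python sketch (not
mutt code; the sample character is an arbitrary choice of mine, and
Python's codecs only stand in for whatever iconv does on your system).
It encodes one traditional character as GB18030 and then tries to
decode the bytes under both labels:

```python
# Sketch: why a strict GB2312 decoder chokes on traditional characters.
# The sample character is an arbitrary choice, not taken from a real mail.
sample = "體"  # traditional form, absent from GB2312

raw = sample.encode("gb18030")  # bytes as an under-labelled mail would carry them


def try_decode(data, charset):
    """Return the decoded text, or None if the bytes are invalid in charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


as_gb2312 = try_decode(raw, "gb2312")    # what the MIME label claims
as_gb18030 = try_decode(raw, "gb18030")  # what the charset-hook would use
print(as_gb2312, as_gb18030)
```

    With the hook in place the second decoding is the one that gets
used, so the character survives instead of being rejected or replaced.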

    Let's evaluate this. The first charset is a (nearly) perfect
subset of the second, so there should not be any drawback. In theory, 2
characters differ: an EM DASH versus a HORIZONTAL BAR, and a MIDDLE DOT
versus a KATAKANA MIDDLE DOT. They probably share a glyph, or are not
distinguishable... In practice probably not a drawback at all.
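
    If you want to check the differences yourself, the following
Python sketch walks the two-byte GB2312 code range and reports every
code that decodes differently under the two charsets. It relies on
Python's own "gb2312" and "gb18030" codecs, which is an assumption on
my part: the tables in your iconv library may not match them exactly.
The function names are made up for this illustration.

```python
# Sketch: list the two-byte codes that decode differently in Python's
# "gb2312" and "gb18030" codecs. iconv's tables may not match exactly.
import unicodedata


def differing_codes():
    """Map each differing byte pair to (gb2312 text, gb18030 text)."""
    diffs = {}
    for lead in range(0xA1, 0xF8):        # GB2312 rows 1..87
        for trail in range(0xA1, 0xFF):   # GB2312 cells 1..94
            raw = bytes([lead, trail])
            try:
                old = raw.decode("gb2312")
            except UnicodeDecodeError:
                continue                  # unassigned in GB2312
            new = raw.decode("gb18030", errors="replace")
            if old != new:
                diffs[raw] = (old, new)
    return diffs


def names(text):
    """Unicode names of all characters in text, '?' when unnamed."""
    return " + ".join(unicodedata.name(ch, "?") for ch in text)


for raw, (old, new) in differing_codes().items():
    print(raw.hex(), names(old), "->", names(new))
```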

    Benefit: How many such under-labelled mails do you receive? One from
time to time, 10%, 90%, ??

    Conclusion: I'm all for including this line in Debian's /etc/Muttrc.
What's your opinion, Dato?

| # Some GB18030 traditional Chinese mails are wrongly labelled GB2312.
| # GB18030 is a (near) superset of GB2312. Let's alias it:
| charset-hook ^gb2312$ gb18030

    I'm not for upstream inclusion, because I'm not sure that every iconv
library out there supports GB18030. But I'll probably start recommending it.


Bye!    Alain.
-- 
« if you believe the Content-Type header, I've got a bridge to sell you. »
