On June 29, 2005 at 23:19, Jeff Breidenbach wrote:

> I've seen a small but not tiny number of messages where the
> Mail User Agent is sticking raw iso-8859-1 characters (outside
> the ASCII range) inside the Subject: header. And not using
> an RFC 2047 encoding. Our software is barfing on those
> characters when we convert to UTF-8, for example:
>
> http://www.mail-archive.com/[email protected]/msg09684.html

They are illegal. Only ASCII characters are allowed in message
headers (hence the non-ASCII encoding defined in the MIME specs).
However, some locales "bend" the rules.

With mhonarc, you can try the following trick:

    <CharsetAliases>
    iso-8859-1; plain
    </CharsetAliases>

"plain" is the special charset name for characters in message header
fields that are not part of a non-ASCII encoded string. By default,
mhonarc treats "plain" as "us-ascii", but you can use the above
resource to change this.

With respect to TEXTENCODE, this will cause the "plain" text to be
treated as iso-8859-1 text for purposes of encoding (in your case,
to utf-8).

--ewh

P.S. You may also want to look at DEFCHARSET for text message bodies,
since the problem you cited can also happen there. You may want to
add the following to your mhonarc resource files:

    <DefCharset>
    iso-8859-1
    </DefCharset>

P.P.S. The above changes will only affect new messages unless you
RECONVERT existing messages.

_______________________________________________
Discussion list for The Mail Archive
[email protected]
http://jab.org/cgi-bin/mailman/listinfo/gossip
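
For readers who want to see the idea outside of mhonarc, here is a
rough Python sketch of the fallback described above: header text that
is not inside an RFC 2047 encoded word is decoded with a configurable
"plain" charset before being re-encoded as UTF-8. This is not
mhonarc's code (mhonarc is written in Perl); the function name
header_to_utf8 and the plain_charset parameter are made up for this
illustration, and only Python's standard email.header.decode_header
is assumed.

```python
from email.header import decode_header


def header_to_utf8(raw: bytes, plain_charset: str = "us-ascii") -> bytes:
    """Convert a raw header value to UTF-8 (hypothetical helper).

    RFC 2047 encoded words are decoded with their declared charset.
    Everything else is decoded with plain_charset, which plays the
    role of the "plain" charset alias described in the message.
    """
    # latin-1 maps every byte to the code point of the same value,
    # so this round trip keeps the original bytes recoverable.
    text = raw.decode("latin-1")

    pieces = []
    for chunk, charset in decode_header(text):
        # decode_header returns str when no encoded word is present,
        # and bytes otherwise; normalize to the original raw bytes.
        if isinstance(chunk, str):
            chunk = chunk.encode("latin-1")
        # charset is None for text outside an encoded word.
        pieces.append(chunk.decode(charset or plain_charset, errors="replace"))

    return "".join(pieces).encode("utf-8")


if __name__ == "__main__":
    raw_subject = b"R\xe9sum\xe9 attached"  # raw iso-8859-1, no RFC 2047
    print(header_to_utf8(raw_subject))                # 8-bit bytes replaced
    print(header_to_utf8(raw_subject, "iso-8859-1"))  # 8-bit bytes preserved
```

With the default of "us-ascii" the 8-bit bytes cannot be decoded and
are replaced; passing "iso-8859-1" corresponds to the CharsetAliases
setting shown above. An encoded word naming a charset Python does not
know would raise LookupError in this sketch.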
