Hello,
On Sun, 12 Feb 2006, Henne Vogelsang wrote:
>> Also strange that the first two are not in alphabetical order.
>
>I guess because of special characters (é, ö) which are by RFC not
>allowed in mail headers.
RFC 2047 says otherwise: non-ASCII text is allowed, but anything that
is not 7-bit ASCII has to be encoded. Most user agents, however, have
bugs in encoding and decoding such headers. I know of only two MUAs that
are not obviously faulty[0], and even those can fail when decoding or
re-encoding some of the errors introduced by other so-called "MUA"s.
Ergo: One SHOULD NOT use anything but plain 7-bit ASCII in headers.
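For illustration, a minimal Python sketch of what a conforming sender
has to do with such a name, using the standard library's email.header
module. The name is made up for the example, and whether the library
picks Q or Base64 encoding is its own business:

```python
from email.header import Header, decode_header, make_header

# Hypothetical display name with the kind of characters discussed above.
name = "André Köhler"

# Header.encode() emits RFC 2047 encoded-words, i.e. pure 7-bit ASCII.
encoded = Header(name, "utf-8").encode()

# Decoding the encoded-words once should give back the original text.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(encoded.isascii(), decoded == name)
```

That is the well-behaved case. The trouble starts with everything that
does not do it this way.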
And about sorting possibly differently encoded Subjects and Froms in
the ML-Archive: Don't even think about it!
Henne: let the archiving software decode validly encoded headers _once_
to cater for "sane" headers and forget about doing anything more.
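A rough sketch of "decode once and leave it at that", again in Python
with email.header. The function name decode_once and the fallback of
returning the raw value unchanged are my assumptions, not something
any particular archiving software does:

```python
from email.errors import HeaderParseError
from email.header import decode_header, make_header


def decode_once(raw_value: str) -> str:
    """Decode RFC 2047 encoded-words exactly once.

    If the header cannot be decoded cleanly (bad Base64, unknown
    charset, invalid bytes), return the raw value untouched instead
    of trying to repair it.
    """
    try:
        return str(make_header(decode_header(raw_value)))
    except (HeaderParseError, LookupError, ValueError):
        return raw_value


print(decode_once("=?UTF-8?Q?=C3=BC?="))  # a sane header: prints ü
print(decode_once("=?bogus-charset?Q?abc?="))  # left as it is
```

No second pass, no guessing, no repair attempts.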
There are just too many broken headers out there, with spurious
whitespace (and linebreaks) in the middle of encoded-words, with
multiply encoded "words"[1], etc. If you try to implement/cater
for that you'll go insane. There's ALWAYS yet another piece of crap
software sending mails with yet another way of COMPLETELY fucking up
the headers...
-dnh
[0] gnus and mutt
[1] such as '=?UTF-8?Q?=3D=3FUTF=2D8=3FQ=3F=3DC3=3DBC=3F=3D?=' and
that's a trivial case of encoding twice with only one charset and
only QP involved... Go figure what it'd be with Base64 and a bunch
of different charsets and, more importantly, with the encoding broken
in between, e.g. by inserted linebreaks etc. For example (still a
"simple" case):
Subject: =?UTF-8?Q?=3D=3FUTF=2D8=3FQ=3F=3DC3=3F=3D?=
 =?UTF-8?Q?=3D=3FUTF=2D8=3FQ=3F=3DBC=3F=3D?=
And that's a single-letter "Subject"! You see what I'm getting
at?
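To make the trivial case from [1] concrete, a small Python sketch that
decodes once and merely checks whether the result still looks like an
encoded-word. The regular expression is a rough heuristic of mine, not
the RFC 2047 grammar, and the function name is made up:

```python
import re
from email.header import decode_header, make_header

# Rough pattern for something that looks like an encoded-word.
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def still_encoded_after_one_pass(raw_value: str) -> bool:
    """Decode once; report whether the result still looks encoded."""
    once = str(make_header(decode_header(raw_value)))
    return bool(ENCODED_WORD.search(once))


trivial = "=?UTF-8?Q?=3D=3FUTF=2D8=3FQ=3F=3DC3=3DBC=3F=3D?="
print(still_encoded_after_one_pass(trivial))               # True
print(still_encoded_after_one_pass("=?UTF-8?Q?=C3=BC?="))  # False
```

An archive could use such a check to flag a header as broken and leave
it alone, rather than peeling off layer after layer.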
--
<groan> Oh Ghods, not another personality...I'm losing track.
Look: Could everyone who is me, please let me know? Thanks. -- Jim