Re: handling internationalized headers

Jason R. Mastaler Wed, 16 Oct 2002 09:17:25 -0700

[EMAIL PROTECTED] writes:

> I don't suppose it's possible for the code to guess whether a header
> should be encoded or not...IIUC, no headers should contain any 8-bit
> values so if a header does, I presume it should undergo appropriate
> MIME encoding.


No, because it's entirely possible to have a sequence of 7-bit-only
bytes in a charset that has nothing to do with US-ASCII.  If we guess
from the absence of 8-bit bytes that a header needs no MIME encoding,
we discard the charset information.

ISO-2022-JP is a perfect example.  It's a completely 7-bit encoding,
but the characters it represents have nothing to do with US-ASCII
(apart from the escape sequences that switch character sets).

But if we decided not to MIME encode the header because it contained
no 8-bit values, then the receiving side would receive what looked
like garbled US-ASCII characters -- the fact that it's actually
encoded in ISO-2022-JP would be completely lost, due to the guess that
caused us to not include the charset information.
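
To make the point concrete, here is a small Python sketch (my own
illustration, not code from TMDA).  It encodes a short Japanese string
as ISO-2022-JP, checks that every byte is below 0x80, and shows that a
reader assuming US-ASCII gets a different string back.  The names
`has_8bit` and `misread` are made up for this example.

```python
# Three Japanese characters, written as escapes so this file stays ASCII.
text = "\u65e5\u672c\u8a9e"

# ISO-2022-JP uses escape sequences plus bytes in the 7-bit range.
raw = text.encode("iso-2022-jp")

def has_8bit(data: bytes) -> bool:
    """The naive guess: 'needs MIME encoding' only if a byte is >= 0x80."""
    return any(b >= 0x80 for b in data)

# The naive check finds nothing to encode...
needs_encoding_guess = has_8bit(raw)

# ...so a receiver with no charset label would read the bytes as US-ASCII.
misread = raw.decode("ascii")

# Only a receiver that knows the charset recovers the original text.
recovered = raw.decode("iso-2022-jp")
```

The decode as US-ASCII succeeds without error, which is exactly the
problem: nothing signals that the result is wrong.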

> If the guessing cannot be made reliable, this might be a reason to
> have a separate program that interacts w/ the user while attempting
> to guess whether to encode.

Because guessing is next to impossible, I'm more inclined to go with
Tim's suggestion of flagging particular headers in the template with a
charset if that header needs encoding.

This will also allow mixed encodings in the headers.  Otherwise, there
would be no way for someone to use Turkish (ISO-8859-9) letters in the
From: header and Japanese (ISO-2022-JP) in the Subject: header, for
example.
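
A rough sketch of what per-header charset flagging could look like,
using Python's standard email.header module.  The `TEMPLATE_HEADERS`
mapping and the `encode_header` and `round_trip` helpers are
hypothetical; they are not TMDA's template syntax, only an
illustration of each header carrying its own charset.

```python
from email.header import Header, decode_header

# Hypothetical template data: header name -> (value, charset or None).
# For From:, only a display name is shown; the address is left out.
TEMPLATE_HEADERS = {
    "From": ("\u0130smail \u015eahin", "iso-8859-9"),
    "Subject": ("\u65e5\u672c\u8a9e", "iso-2022-jp"),
    "X-Note": ("plain ASCII value", None),
}

def encode_header(value, charset=None):
    """MIME-encode value only when the template flags a charset."""
    if charset is None:
        return value  # unflagged headers pass through untouched
    return Header(value, charset).encode()

def round_trip(encoded_value):
    """Decode a single encoded-word back to text, for checking."""
    raw, charset = decode_header(encoded_value)[0]
    return raw.decode(charset)

encoded = {
    name: encode_header(value, charset)
    for name, (value, charset) in TEMPLATE_HEADERS.items()
}
```

Because the charset travels with each header, the From: and Subject:
values above can use different charsets in the same message, and no
guessing about byte values is involved.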
_________________________________________________
tmda-workers mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-workers
