Re: handling internationalized headers

Jason R. Mastaler Wed, 16 Oct 2002 14:56:28 -0700

[EMAIL PROTECTED] writes:

> I guess questions in my mind are:
>
>   1) Is it likely that multiple non-ASCII character sets will appear
>      in headers for a single message?


This is probably not common, but do we want to give up this ability
for (IMO) only a mild user-interface improvement?  I think some will
find it useful.

>   2) If not, is knowing the language enough to perform detection to
>      determine whether to encode a header value (or a portion of a
>      header value)?

I don't know the answer to this, but even if detection is reliable, I
wonder how much extra complexity it would take to map all the world's
languages to their proper charsets?

I've gotten into trouble before (e.g., the initial keyword address
implementation) trying to rely on complex heuristics for something
that should have been specified explicitly.

It's not as seamless and sexy to have to type in the charset, but it
will always be predictable and accurate.
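
To illustrate why knowing the language alone may not settle the
charset, here is a small sketch in present-day Python (not TMDA code).
The codec list and the helper name encodings_for are my own
assumptions, chosen only for illustration; the point is that one
Japanese string has several valid byte encodings, so a
language-to-charset map would still have to guess.

```python
# Illustrative only: three codecs that are all in use for Japanese text.
# The list is an assumption for this sketch, not an exhaustive table.
JAPANESE_CODECS = ["euc_jp", "iso2022_jp", "shift_jis"]


def encodings_for(text, codecs=JAPANESE_CODECS):
    """Return {codec: bytes} for every codec able to encode `text`."""
    result = {}
    for codec in codecs:
        try:
            result[codec] = text.encode(codec)
        except UnicodeEncodeError:
            # This codec cannot represent the text; skip it.
            pass
    return result


variants = encodings_for("確認")
for codec, raw in variants.items():
    # Same characters, different bytes under each codec.
    print(codec, raw.hex())
```

Since each codec yields different bytes for the same text, an explicit
charset in the template removes the guesswork.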

> What I'm not too excited about is for the user to have to focus on
> which headers need to be mime-encoded and which headers don't --
> spelling this info out explicitly and wiring it in some
> configuration file doesn't excite me.

What would excite you?  (I'm being serious)

> Practically speaking, perhaps the only two headers that are likely to
> need encoding are From and Subject so maybe this is not really much
> of an issue anyway.

Then your confirm_request.txt might look like:

  From.EUC-JP: "%(FULLNAME)s" <%(recipient_address)s>
  Subject.EUC-JP: please confirm your message / [EUC-JP text]
  Reply-To.US-ASCII: %(confirm_accept_address)s
  BodyCharset: EUC-JP

  Blah, Blah.. [EUC-JP text]
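
As a rough sketch of how such template lines could be processed (this
is not the TMDA implementation, and it uses today's Python standard
library rather than anything available in the current release): the
helper names parse_template_header and encode_header_value are
hypothetical, and defaulting to us-ascii when no charset suffix is
given is my assumption.

```python
from email.header import Header, decode_header, make_header


def parse_template_header(line):
    """Split 'Name.CHARSET: value' into (name, charset, value).

    Hypothetical helper: a header line without a '.CHARSET' suffix is
    assumed to be us-ascii.
    """
    field, _, value = line.partition(":")
    name, _, charset = field.strip().partition(".")
    return name, charset or "us-ascii", value.strip()


def encode_header_value(value, charset):
    """Return `value` as an RFC 2047 header string for `charset`.

    Pure ASCII values declared as us-ascii pass through unchanged. The
    charset label in the encoded-word is chosen by the email package and
    may differ from the one supplied.
    """
    return Header(value, charset).encode()


name, charset, value = parse_template_header(
    "Subject.EUC-JP: please confirm / 確認")
encoded = encode_header_value(value, charset)
print("%s: %s" % (name, encoded))

# Decoding the result should give back the original text.
print(str(make_header(decode_header(encoded))))
```

The charset is read straight from the header name, so nothing has to
be detected or guessed at send time.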

If you maintain a site installation, you could always modify the
default templates to reflect the above, so users won't have to think
about doing this themselves.

> As a side note, I don't imagine it'll be much of an issue from a
> practical standpoint because of its rarity, but I have seen multiple
> charsets used in a single header value - e.g. when From or To has
> multiple addresses.

To support this, we'd have to redesign the templates again to allow
the user to delimit portions of text in a header value with a charset.
I think I'd rather see if users start requesting this before I think
of trying to implement it.

I suspect the most common usage of an international header will be
English + another language, or just another language, and these can
both be supported with a single charset specification.
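
A quick way to check that claim, sketched in present-day Python: the
helper name fits_single_charset is hypothetical, and the sample
strings are mine. EUC-JP is ASCII-compatible, so English plus Japanese
fits in one charset, while text that mixes in a third script does not.

```python
def fits_single_charset(text, charset):
    """Return True if every character of `text` is encodable in `charset`."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


# English only, and English + Japanese: one charset is enough.
print(fits_single_charset("please confirm your message", "us-ascii"))
print(fits_single_charset("please confirm your message / 確認", "euc_jp"))

# Japanese + Korean in one value: this is the rarer multi-charset case.
print(fits_single_charset("確認 / 확인", "euc_jp"))
```

Only the last case would need per-portion charset delimiters in the
template.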

> Anyway, I suppose no one is forcing me to use this mechanism so I
> think I'll go quiet on this now (-;

Well, if you don't speak up now, you'll be stuck with whatever I come
up with <wink>.  That's the point of this list, to toss ideas around
before an implementation makes it into release.  No need to go quiet.
_________________________________________________
tmda-workers mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-workers
