Hi,

I guess I'm not going quiet after all (-;

From: "Jason R. Mastaler"
Subject: Re: handling internationalized headers
Date: Wed, 16 Oct 2002 16:22:32 -0600

> [EMAIL PROTECTED] writes:
> 
> > I guess questions in my mind are:
> >
> >   1) Is it likely that multiple non-ASCII character sets will appear
> >      in headers for a single message?
> 
> This is probably not common, but do we want to give up this ability
> for (IMO) only a mild user-interface improvement?  Some will find this
> useful I think.

I suppose so -- I don't think it's likely, but that's just in my
experience in one part of the world.

> >   2) If not, is knowing the language enough to perform detection to
> >      determine whether to encode a header value (or a portion of a
> >      header value)?
> 
> I don't know the answer to this, but even if this is reliable, I
> wonder how much extra complexity it would require to map all the
> world's languages to their proper charset?

In general my preference is for simplicity -- unfortunately,
simplicity for the user does not necessarily translate into simplicity
for the developer and vice versa.  Of course, it's nicest if you get
simplicity in both places!

> I've gotten into trouble before (e.g., initial keyword address
> implementation) trying to rely on complex heuristics for something
> that should have been specified explicitly.

I think it's very healthy to reflect on one's experience from the past
and try to make use of it in the future (-;

> > What I'm not too excited about is for the user to have to focus on
> > which headers need to be mime-encoded and which headers don't --
> > spelling this info out explicitly and wiring it in some
> > configuration file doesn't excite me.
> 
> What would excite you?  (I'm being serious)

What I would be happier w/ is something equivalent to what a user does
when composing an email message -- the user doesn't have to think
about mime-encoding at all assuming the mail client is well-written.

> > Practically speaking, perhaps the only two headers that are likely to
> > need encoding are From and Subject so maybe this is not really much
> > of an issue anyway.  
> 
> Then your confirm_request.txt might look like:
> 
>   From.EUC-JP: "%(FULLNAME)s" <%(recipient_address)s>
>   Subject.EUC-JP: please confirm your message / [EUC-JP text]
>   Reply-To.US-ASCII: %(confirm_accept_address)s
>   BodyCharset: EUC-JP
> 
>   Blah, Blah.. [EUC-JP text]
>
> If you maintain a site installation, you could always modify the
> default templates to reflect the above, so users won't have to think
> about doing this themselves.
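
For what it's worth, a rough sketch of how template lines in the `Name.CHARSET: value` form above might be handled. The function names are made up for illustration, and it assumes the `%(...)s` substitution has already happened before encoding. It only deals with the suffixed header lines, not `BodyCharset`, and it encodes the whole value, which is fine for Subject but too blunt for an address header where only the display name should be encoded.

```python
from email.header import Header


def parse_template_header(line):
    """Split 'Name.CHARSET: value' into (name, charset, value).

    A header without a '.CHARSET' suffix is treated as us-ascii.
    """
    field, _, value = line.partition(":")
    name, _, charset = field.strip().partition(".")
    return name, (charset or "us-ascii"), value.strip()


def encode_template_header(line):
    """Return (name, encoded_value) for one template header line."""
    name, charset, value = parse_template_header(line)
    header = Header(value, charset=charset, header_name=name)
    return name, header.encode()
```

Note that Python's email package sends EUC-JP input out as ISO-2022-JP in headers, so the charset named in the template is not necessarily the one that appears on the wire.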

In general, I am in favor of being able to specify things explicitly.
In this case, a downside of this is that the user has to learn how to
do it.

One advantage of an interactive process that takes a template w/ no
mime-encoding and asks the user for clarification (after simple
guessing) is that it would free the user from having to learn how to
specify the encoding in a template.  Perhaps the interactive process
could generate templates like the example you showed above.

If the interactive process is a time-consuming endeavor, perhaps it is
not worth it though.
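
A minimal sketch of the core of such a process, under the assumption that the guess is simply "ASCII stays US-ASCII, anything else gets the user's usual charset". The `confirm` callback is a hypothetical stand-in for whatever prompt would ask the user to accept or correct the guess.

```python
def annotate_header(line, guess="EUC-JP", confirm=None):
    """Turn 'Name: value' into 'Name.CHARSET: value'.

    Pure-ASCII values are labelled US-ASCII; anything else gets the
    guessed charset. If a confirm(name, charset) callback is given,
    its return value overrides the guess.
    """
    name, sep, value = line.partition(":")
    if not sep:
        return line  # not a header line; leave it untouched
    is_ascii = all(ord(ch) < 128 for ch in value)
    charset = "US-ASCII" if is_ascii else guess
    if confirm is not None:
        charset = confirm(name.strip(), charset)
    return "%s.%s:%s" % (name.strip(), charset, value)
```

Running this over each header line of a plain template would produce something in the shape of the example quoted above, with the user only being asked about the guesses.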

> > As a side note, I don't imagine it'll be much of an issue from a
> > practical standpoint because of rarity, but I have seen multiple
> > charsets used in a single header value - e.g. when From or To has
> > multiple addresses.
> 
> To support this, we'd have to redesign the templates again to allow
> the user to delimit portions of text in a header value with a charset.
> I think I'd rather see if users start requesting this before I think
> of trying to implement it.

I think what you suggest about seeing if there is an actual need makes
a lot of sense.
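
Just for reference, should the need ever come up: the encoding side is not the hard part, since `email.header.Header` already accepts chunks with different charsets. A small sketch, where `build_mixed_header` is a made-up helper name and the chunks are meant to be free text or display names, not the addresses themselves:

```python
from email.header import Header


def build_mixed_header(chunks, header_name="Subject"):
    """Build one header value from (text, charset) chunks.

    Each chunk is encoded with its own charset, so a single header
    value can carry text in several character sets.
    """
    header = Header(header_name=header_name)
    for text, charset in chunks:
        header.append(text, charset)
    return header.encode()
```

The real cost would be in the template syntax for marking up the chunks, as you say.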

> I suspect the most common usage of an international header will be
> English + another language, or just another language, and these can
> both be supported with a single charset specification.

I suspect this as well [1].


[1] As a side note, I suppose since Singapore has 3 languages in
    more-or-less widespread use and IIUC people who grow up there are
    required to choose 2, there'd be cases where English might be
    neither of these.  But then, perhaps it's more likely that
    messages would have 3 languages (including English)... Perhaps if
    there's anyone more familiar w/ that area on this list, they might
    be able to make better comments (-;
_________________________________________________
tmda-workers mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-workers