Re: handling internationalized headers

sen_ml Wed, 16 Oct 2002 22:43:29 -0700

Hi,

From: "Jason R. Mastaler"
Subject: Re: handling internationalized headers
Date: Wed, 16 Oct 2002 19:54:27 -0600


> [EMAIL PROTECTED] writes:
> 
> > To clarify, I didn't mean writing an editor -- what I meant was
> > something like:
> >
> >   1) User creates template using favorite editor
> >
> >   2) User runs it through program to guess and perform encoding for
> >      headers
> >
> > Does that seem like it's a lot of work?
> 
> How would #2 work?  It's not possible given a sequence of bytes to
> guess what charset it should be encoded with.  We discussed this point
> on mimelib-devel today actually.  Quoting Ben Gertzfield:
> 
>   ``It's sometimes possible for Asian charsets, but not for Latin ones
>   without some deep statistical analysis.  All the ISO-8859-* charsets
>   use the byte values above 128 for distinct characters, but there's
>   no pattern of their use.''

I understand that in the most general and abstract case it may not be
possible.  If you had an interactive process that got feedback from
the user, it could learn about preferences and perhaps provide better
guesses in the future.  I got the feeling that a given user's template
preferences would not change that much over time.
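
To make the quoted point concrete, here is a rough sketch in present-day Python 3 (not anything taken from TMDA or mimelib). It shows the same bytes decoding cleanly under several ISO-8859-* charsets, and a preference-ordered guess of the kind I have in mind. The name guess_charset and the preference list are made up for illustration.

```python
# The same three bytes are valid text in every one of these charsets,
# so the bytes alone cannot tell us which one the user meant.
raw = b"\xe4\xf6\xfc"
candidates = ["iso-8859-1", "iso-8859-2", "iso-8859-5", "iso-8859-7"]
decoded = {cs: raw.decode(cs) for cs in candidates}


def guess_charset(data, preferences):
    """Return the first charset in `preferences` that decodes `data` cleanly.

    Hypothetical helper: `preferences` stands in for whatever the program
    has learned from the user's earlier answers.
    """
    for cs in preferences:
        try:
            data.decode(cs)
        except UnicodeDecodeError:
            continue
        return cs
    return None


# Assumed preference order for one particular user.
user_preferences = ["ascii", "utf-8", "iso-8859-1"]
best = guess_charset(raw, user_preferences)
```

Since every ISO-8859-* charset accepts these bytes, the answer depends on the order of the preference list, not on the data. That is why I think feedback from the user matters more than analysis of the bytes.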

> > Just a thought -- it seems to me that if you are going to provide a
> > way to be explicit about header encodings, you could also let the user
> > specify what encoding their template is stored in.
> 
> I'm already doing this.
> 
> To clarify, the Header.CHARSET syntax allows the user to specify the
> input charset.  How to properly encode that in headers and body is
> determined automatically by the Python email package when the message
> is sent.

Ah, I see.  I was confused about the meaning.  Thanks for
straightening me out.
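
For anyone else following along, this is roughly how I now understand the division of labour, sketched with the email.header module of a current Python 3 (in 2002 the module is spelled email.Header). The subject text is made up, and this is not TMDA's actual code: the caller names the input charset, and the package picks the wire encoding.

```python
from email.header import Header, decode_header

# The user only states which charset the text is in (here EUC-JP).
subject_text = "日本語の件名"
header = Header(subject_text, "euc-jp")

# The package chooses the RFC 2047 encoded-word form by itself.
encoded = header.encode()

# Decode it again to check that nothing was lost on the way.
raw, wire_charset = decode_header(encoded)[0]
round_tripped = raw.decode(wire_charset)
```

As far as I can tell, the package's charset table maps EUC-JP input to ISO-2022-JP with base64 on the wire, so the user never has to specify the output encoding.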

> > FWIW, I'm perfectly happy w/ templates in EUC-JP.
> 
> That's good, because you don't have any other choice <wink> because of
> the template string substitution issue.  I believe it was you who
> reported problems with templates in ISO-2022-JP.

Yes, I believe I did (-;  

From my perspective, this particular problem arises from an
unfortunate coincidence of:

  1) A mechanism in Python being a certain way

  2) TMDA employing that particular mechanism

  3) ISO-2022-JP escape sequences being a certain way

It's very unfortunate that Python didn't get a lot of feedback from
users working in Japanese early in its development -- I suspect if it
had, this particular problem would not exist.
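
Here is my best guess at the mechanics, as a sketch in current Python 3 using bytes formatting. I am assuming the mechanism in question is %-style substitution with %(name)s placeholders; the template text and the key are invented. ISO-2022-JP is a 7-bit encoding, and every katakana character has 0x25 ("%") as its first byte, which the substitution then tries to read as a format specifier. EUC-JP sets the high bit on those bytes, so it never collides.

```python
katakana = "ア"

# ISO-2022-JP: ESC $ B, then the two 7-bit bytes 0x25 0x22, then ESC ( B.
iso2022 = katakana.encode("iso-2022-jp")
# EUC-JP: the same character with the high bit set on both bytes.
eucjp = katakana.encode("euc-jp")

# Hypothetical template: encoded text followed by one placeholder.
template = iso2022 + b" %(name)s"
try:
    template % {b"name": b"value"}
    substitution_failed = False
except ValueError:
    # The "%" inside the encoded katakana is parsed as a format specifier.
    substitution_failed = True

# The same template in EUC-JP substitutes without trouble.
safe = (eucjp + b" %(name)s") % {b"name": b"value"}
```

Doubling every "%" in the encoded text before substitution would presumably work around it, but that is the kind of thing a template author should not have to know about.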

_________________________________________________
tmda-workers mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-workers
