On approximately 10/9/2009 1:38 AM, came the following characters from the keyboard of Tokio Kikuchi:
Glenn Linderman wrote:
On approximately 10/8/2009 8:47 PM, came the following characters from
the keyboard of Tokio Kikuchi:
Actually, as long as the prepended text is ASCII, all that work can be
done on the encoded value.  When it is not ASCII, it may still be
separated and recognizable.  Still that logic is more complex than
decoding, handling as Unicode, and encoding.... when it works.  Just
pointing out that there is more than one way to do things...
Oh, really?

Base64 is a 3-to-4 octet encoding, and there is no way to prepend padding.
In header values, encoding is done using encoded-words.  A header value
consists of a sequence of ASCII words and encoded-words.  While an
encoded-word that uses base64 encoding cannot easily be adjusted to
prepend data into that encoded-word, additional ASCII words or
encoded-words can be prepended in front of the other ASCII words or
encoded-words within the header value.
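
A minimal sketch of why the inside of a base64 encoded-word is awkward to prepend to. It uses only the standard base64 module; the variable names are illustrative. Because base64 maps each group of 3 octets to 4 characters, encoding a prefix separately and gluing the result on only lines up when the prefix length is a multiple of 3:

```python
import base64

prefix = b"[mmjp-users 123] "           # 17 octets, not a multiple of 3
body = "日本語".encode("iso-2022-jp")    # sample Japanese text for illustration

# Encoding everything in one pass gives a single clean base64 payload.
whole = base64.b64encode(prefix + body)

# Encoding the pieces separately leaves "=" padding in the middle,
# so the glued result is not the same payload.
glued = base64.b64encode(prefix) + base64.b64encode(body)

print(whole.decode("ascii"))
print(glued.decode("ascii"))
```

Adding a separate word in front of the existing encoded-word avoids this problem, since each word is encoded on its own.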

So, yes, really!

The following two lines have equivalent header contents:

Re: [mmjp-users 123] =?iso-2022-jp?b?GyRCRnxLXDhsGyhC?=
Re: =?iso-2022-jp?b?W21tanAtdXNlcnMgMTIzXSAbJEJGfEtcOGwbKEI=?=

I'd like to see how you can extract the ASCII part without touching the
rest of the encoded-word in the second example.

I can't, and I didn't say I could.
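
The equivalence of the two Subject lines quoted above can be checked with the standard library's email.header.decode_header. This is only a sketch: decode_value is a hypothetical helper, and it does no error handling for unknown charsets or malformed words.

```python
from email.header import decode_header

def decode_value(raw_value):
    """Decode a raw header value to a Unicode string (no error handling)."""
    parts = []
    for data, charset in decode_header(raw_value):
        if isinstance(data, bytes):
            # Unencoded chunks come back with charset None; treat them as ASCII.
            parts.append(data.decode(charset or "ascii"))
        else:
            parts.append(data)
    return "".join(parts)

first = "Re: [mmjp-users 123] =?iso-2022-jp?b?GyRCRnxLXDhsGyhC?="
second = "Re: =?iso-2022-jp?b?W21tanAtdXNlcnMgMTIzXSAbJEJGfEtcOGwbKEI=?="

# Both raw forms should yield the same decoded subject text.
print(decode_value(first) == decode_value(second))
```

In the second form the list tag only becomes visible after decoding, which is the point being made.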

What we do in Mailman is treat both equally: we delete
[mmjp-users 123] from the subject and prefix it again with [mmjp-users 124]
(with the new sequence number).  Some MUAs encode subjects like the second
example, and this is beyond our control.  Therefore, we are forced to
decode the whole header content.
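
The decode, strip, re-prefix, and encode cycle described above might look roughly like this. It is a sketch only and is not Mailman's actual code: decode_value, renumber_subject, the tag pattern, and the choice of iso-2022-jp for re-encoding are all assumptions made for illustration.

```python
import re
from email.header import Header, decode_header

# Hypothetical pattern for the list tag with its sequence number.
TAG = re.compile(r"\[mmjp-users \d+\]\s*")

def decode_value(raw_value):
    """Decode a raw header value to a Unicode string (no error handling)."""
    parts = []
    for data, charset in decode_header(raw_value):
        if isinstance(data, bytes):
            parts.append(data.decode(charset or "ascii"))
        else:
            parts.append(data)
    return "".join(parts)

def renumber_subject(raw_subject, new_number, charset="iso-2022-jp"):
    """Drop any old tag, put a new one in front, and re-encode the header."""
    text = TAG.sub("", decode_value(raw_subject))
    return Header(f"[mmjp-users {new_number}] {text}", charset).encode()

a = renumber_subject("Re: [mmjp-users 123] =?iso-2022-jp?b?GyRCRnxLXDhsGyhC?=", 124)
b = renumber_subject(
    "Re: =?iso-2022-jp?b?W21tanAtdXNlcnMgMTIzXSAbJEJGfEtcOGwbKEI=?=", 124)
print(a == b)
```

Both input forms go through the same code path, which is the simplicity being argued for; the cost is that the whole value must be decodable.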

Yes, if the MUA has created the second encoding, decoding is required in order to replace the header prefix.

If the MUA has created the first encoding, then decoding would not be required in order to replace the header prefix, but the logic to detect which case applies and to handle the two cases separately adds complexity to the application.

What I said was that prefixing a header value with additional text didn't require decoding, and that is true.
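
As a sketch of that claim, prefixing can be done purely on the raw value. The helper prefix_header_value is hypothetical, and the use of utf-8 for a non-ASCII prefix is an assumption. An ASCII prefix is added as plain text; a non-ASCII prefix becomes its own encoded-word, relying on the RFC 2047 rule that whitespace between adjacent encoded-words is ignored when displayed.

```python
from email.header import Header

def prefix_header_value(raw_value, prefix, charset="utf-8"):
    """Prepend text to a raw header value without decoding the existing value."""
    if prefix.isascii():
        token = prefix                            # plain ASCII word(s)
    else:
        token = Header(prefix, charset).encode()  # separate encoded-word
    return f"{token} {raw_value}"

raw = "=?iso-2022-jp?b?GyRCRnxLXDhsGyhC?="
print(prefix_header_value(raw, "[mmjp-users 124]"))
print(prefix_header_value(raw, "お知らせ"))
```

The existing encoded-word is carried through byte for byte, so its charset never needs to be understood. A long result would still need folding before being written to a message, which this sketch does not attempt.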

What you are saying is that you want to do more than prefix a header value with additional text.

What you are saying is that you would rather keep the application logic simple by assuming or requiring that the existing header value can be decoded. If that is sufficient for your application, it is a reasonable choice. What do you do with messages for which the header you wish to modify cannot be decoded? Some options would be:

1) bounce the message

2) discard the message

3) determine whether the header value can be partially decoded and, if the part that can be decoded contains the data you wish to modify, modify it, and simply preserve and pass through the parts that could not be decoded.

4) if the header value cannot be decoded at all, or the parts that can be decoded do not contain the data you wish to modify, then you could simply prefix information onto the header, again preserving and passing through the parts that could not be decoded (or, in this case, the whole value).

Perhaps you can think of other alternatives besides these; feel free to suggest some.

Naturally, doing options 3 or 4 above requires more complex logic for the application than options 1 or 2. The requirements of your application should determine the types of choices you make.
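
Options 3 and 4 could be sketched along the following lines. Everything here is an assumption for illustration: the names try_decode and retag, the tag pattern, and the simplified regular expression for encoded-words. Each word is decoded independently, the tag is replaced where it can be found, and anything that cannot be decoded is passed through untouched. If no tag is found, the new tag is simply prefixed.

```python
import re
from email.errors import HeaderParseError
from email.header import Header, decode_header

# Simplified patterns, assumed for this sketch.
ENCODED_WORD = re.compile(r"(=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=)")
TAG = re.compile(r"\[mmjp-users \d+\]\s*")

def try_decode(word):
    """Return (text, charset) for one encoded-word, or None if undecodable."""
    try:
        data, charset = decode_header(word)[0]
        return data.decode(charset), charset
    except (HeaderParseError, LookupError, UnicodeDecodeError):
        return None

def retag(raw_value, new_tag):
    pieces, replaced = [], False
    for piece in ENCODED_WORD.split(raw_value):
        if not replaced and ENCODED_WORD.fullmatch(piece):
            decoded = try_decode(piece)
            if decoded and TAG.search(decoded[0]):
                # Option 3: rewrite only the word that holds the tag,
                # re-encoding it in its original charset.
                text, charset = decoded
                text = TAG.sub(lambda m: new_tag + " ", text, count=1)
                piece = Header(text, charset).encode()
                replaced = True
        elif not replaced and TAG.search(piece):
            # The tag sits in plain ASCII text; no decoding needed.
            piece = TAG.sub(lambda m: new_tag + " ", piece, count=1)
            replaced = True
        pieces.append(piece)  # undecodable words pass through as-is
    value = "".join(pieces)
    # Option 4: nothing recognizable was found, so just prefix.
    return value if replaced else f"{new_tag} {value}"

print(retag("Re: =?x-unknown-charset?b?AAAA?=", "[mmjp-users 124]"))
```

This sketch replaces the tag in place rather than moving it to the front, and it would miss a tag split across two encoded-words; both are simplifications.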

For example, if a new or non-standard charset appears, an application that requires the ability to decode the header but hasn't been updated to understand the charset will fail to decode it. Yet if it has logic like 3 or 4, it may be more successful, and would be a more robust application.

--
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

_______________________________________________
Email-SIG mailing list
Email-SIG@python.org