[
https://issues.apache.org/jira/browse/MIME4J-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671454#action_12671454
]
Markus Wiederkehr commented on MIME4J-112:
------------------------------------------
> 1. Preservation of comment data after parsing fields
This should not be a problem since every Field stores the original raw field
string. The raw field string is used when writing the message. The only
information lost is the kind of line delimiter that follows the field but this
could easily be preserved, too.
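To illustrate what "stores the original raw field string" buys us, here is a
minimal sketch. RawFieldSketch and its methods are hypothetical names made up
for this example and are not the Mime4j Field API; the sketch assumes the raw
string was read with a one-byte-per-char mapping so that it can be written
back as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical holder for illustration only; not the actual Mime4j Field class.
final class RawFieldSketch {
    private final String raw;       // field exactly as read: folding and comments included
    private final String delimiter; // line ending that followed the field, e.g. "\r\n" or "\n"

    RawFieldSketch(String raw, String delimiter) {
        this.raw = raw;
        this.delimiter = delimiter;
    }

    // Unfolded view of the body for consumers; the raw string itself is never modified.
    String getBody() {
        int colon = raw.indexOf(':');
        String body = colon < 0 ? "" : raw.substring(colon + 1);
        // Unfolding: drop each line break that is followed by a space or tab.
        return body.replaceAll("\r?\n(?=[ \t])", "").trim();
    }

    // Writing uses the raw string plus the recorded delimiter, so wrapping survives.
    byte[] toBytes() {
        return (raw + delimiter).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        RawFieldSketch f =
                new RawFieldSketch("Subject: wrapped early\n (a comment) value", "\n");
        System.out.println(f.getBody());
        System.out.println(f.toBytes().length);
    }
}
```

The parsed view can be as lossy as it likes; only the raw string and the
delimiter are needed for writing.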
> Another difficulty for unlimited round tripping (without preserving the
> original bits) is how to record the header wrapping for unconventional
> wrapping schemes. For example, a message may choose to wrap header values
> early but this information is lost during parsing.
It is not - see above.
> 2. Preservation of information about character encoding in headers
The field string is built by AbstractEntity using ByteArrayBuffer and
CharArrayBuffer. The CharArrayBuffer uses the following code for converting an
input byte into a character:
    int ch = b[i1];
    if (ch < 0) {
        ch = 256 + ch;
    }
It might not be obvious, but this is ISO-8859-1 conversion (because Unicode code
points 0000 to 00FF correspond directly to ISO-8859-1 byte codes 00 to FF).
So we would only have to use Latin 1 for writing the header fields.
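A quick way to convince yourself is a standalone check like the following. It
is only a sketch: Latin1RoundTrip and its method names are made up for this
example, and the loop merely mirrors the conversion quoted above rather than
calling CharArrayBuffer itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only; class and method names are assumptions, not Mime4j code.
final class Latin1RoundTrip {

    // Same widening as the snippet above: map a signed byte to a char in 0..255.
    static String bytesToChars(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length);
        for (byte b : raw) {
            int ch = b;
            if (ch < 0) {
                ch = 256 + ch;
            }
            sb.append((char) ch);
        }
        return sb.toString();
    }

    // Writing with ISO-8859-1 maps each char in 0..255 back to a single byte.
    static byte[] charsToBytes(String field) {
        return field.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Every possible byte value, including the non-ASCII range.
        byte[] raw = new byte[256];
        for (int i = 0; i < 256; i++) {
            raw[i] = (byte) i;
        }
        String s = bytesToChars(raw);
        // The manual conversion matches the JDK's ISO-8859-1 decoder ...
        System.out.println(s.equals(new String(raw, StandardCharsets.ISO_8859_1)));
        // ... and encoding as ISO-8859-1 gives back the original bytes.
        System.out.println(Arrays.equals(raw, charsToBytes(s)));
    }
}
```

So even headers containing unescaped 8-bit characters would come out
byte for byte as they went in, whatever charset the sender had in mind.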
> 3. Ability to build mail which does comply with the specifications
Unclear to me; what specification are you referring to and how is this related
to round tripping?
> My feeling is that - given the availability of standard meta-data+document
> representations - Mime4J should support only limited round tripping for mail
> building representations.
I don't agree because I think that perfect round tripping might be a
prerequisite for S/MIME canonicalization (MIME4J-113). Canonicalization is
useless if bits of the original content have already been lost.
From my point of view Mime4j also has to preserve the original transfer
encodings. Quoted-printable (even base64) cannot reliably be re-encoded the
same way it was. This might become nasty with inner encodings, for example a
message might contain another message that is transfer encoded entirely.
Mime4j would have to parse that inner message only on demand.
Preserving the original transfer encodings clearly causes some overhead and
should be optional IMO.
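To make the re-encoding problem concrete, here is a small sketch using only
java.util.Base64 from the JDK (not Mime4j's own codecs). ReencodeSketch is a
made-up name, and the 40-character line length with bare LF is just an assumed
example of a sender that wraps differently from the encoder's defaults.

```java
import java.util.Arrays;
import java.util.Base64;

// Illustration only; uses the JDK codec, not Mime4j's encoders.
final class ReencodeSketch {

    // Decode a base64 body, then encode it again with the JDK MIME defaults
    // (lines of 76 characters separated by CRLF).
    static String reencode(String encodedBody) {
        byte[] content = Base64.getMimeDecoder().decode(encodedBody);
        return Base64.getMimeEncoder().encodeToString(content);
    }

    public static void main(String[] args) {
        byte[] content = new byte[90];
        Arrays.fill(content, (byte) 'x');

        // Assumed "original" sender: wraps at 40 characters and uses bare LF.
        String original =
                Base64.getMimeEncoder(40, new byte[] {'\n'}).encodeToString(content);
        String rewritten = reencode(original);

        // The decoded content is identical ...
        System.out.println(Arrays.equals(
                Base64.getMimeDecoder().decode(original),
                Base64.getMimeDecoder().decode(rewritten)));
        // ... but the encoded text is not, so the message bytes have changed.
        System.out.println(original.equals(rewritten));
    }
}
```

The decoded content survives, but the wrapping chosen by the sender does not,
which is why the encoded bytes themselves would have to be kept.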
I think there is not much else to it. Perhaps the kind of line delimiter
between header and body.
> Define Limits Of Round Tripping In Mime4J
> -----------------------------------------
>
> Key: MIME4J-112
> URL: https://issues.apache.org/jira/browse/MIME4J-112
> Project: JAMES Mime4j
> Issue Type: Task
> Affects Versions: 0.6
> Reporter: Robert Burrell Donkin
> Fix For: 0.7
>
>
> By round tripping, I mean parsing some MIME document into a fully decomposed
> form and then recreating a new version of the document from this form.
> In theory, Mime4J decomposition and recomposition could be made perfect with
> no loss of information. In other words, given a MIME document, the parser
> could completely decompose the document and a bitwise identical copy could be
> recomposed.
> In practice, the limits of support are questionable. Some limitations may be
> expedient. For example, perhaps comments and encoding of ASCII characters are
> not sufficiently important to be worth preserving. Other limitations may
> arise from MIME documents which are not strictly compliant with the
> specification - for example, the use of unescaped non-ASCII characters in
> MIME headers may mean that the output would need to be escaped to ensure
> compliance.
> It is important to define and describe the limits of round tripping so that
> users and developers are clear about the level of support Mime4J claims. In
> addition, sufficient unit tests should be created to ensure with confidence
> that documents within these limits are correctly handled.