Robert Burrell Donkin wrote:
On 7/18/08, Stefano Bagnara <[EMAIL PROTECTED]> wrote:
2) ((TextBody) b).getReader(). This gives me a reader, so it supports
the "line" concept. I expect this one to treat "non-canonical"
newlines the way the header/structure parser does: if headers are
allowed to terminate with an isolated LF, then lines in text content
should be allowed to do the same (because the whole MIME message
probably uses LF instead of CRLF). [The RFC seems to suggest that what
happened is that the MIME message was encoded using LF instead of CRLF,
and that this specific encoding breaks binary parts, but we want to be
smarter about this issue.]
TextBody is part of the DOM. This can and should be addressed there
(rather than in the parser). I think that doing this should satisfy
both needs without compromising the performance of the parser.
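
At the DOM level this may need very little code. A minimal sketch,
assuming plain java.io is enough: BufferedReader.readLine() already
accepts CRLF, an isolated LF, or an isolated CR as a line terminator.
LenientLines is a hypothetical helper, not part of mime4j, and the
StringReader stands in for the reader returned by getReader().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a mime4j class.
class LenientLines {

    // Reads all lines from the reader. BufferedReader.readLine() treats
    // "\r\n", a lone "\n" and a lone "\r" as line terminators.
    static List<String> readLines(Reader reader) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader buffered = new BufferedReader(reader)) {
            String line;
            while ((line = buffered.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for ((TextBody) b).getReader(); mixes all three endings.
        Reader reader = new StringReader("one\r\ntwo\nthree\rfour");
        System.out.println(readLines(reader)); // prints [one, two, three, four]
    }
}
```

This only covers line-by-line reading of text bodies; it does not touch
the parser or the bytes stored for the part.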
I don't think that LF/CR replacement is a performance issue. Most
probably the current implementation of the filter stream has performance
issues, but that does not mean that replacing newlines is itself a
problem (we already have to scan for LF/CR/CRLF anyway).
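
To illustrate why the replacement itself should be cheap, here is a
rough sketch of a single pass that rewrites isolated LF and isolated CR
as CRLF while scanning. NewlineNormalizer and toCrlf are hypothetical
names, not the existing mime4j filter stream, and the sketch works on a
byte array rather than a stream to stay short. It must only be applied
to text content: run over a binary part it would corrupt the data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not a mime4j class.
class NewlineNormalizer {

    // Rewrites CRLF, lone CR and lone LF as CRLF in one pass over the input.
    static byte[] toCrlf(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length + 16);
        int i = 0;
        while (i < in.length) {
            byte b = in[i];
            if (b == '\r') {
                out.write('\r');
                out.write('\n');
                // Swallow the LF of an existing CRLF so it is not doubled.
                if (i + 1 < in.length && in[i + 1] == '\n') {
                    i++;
                }
            } else if (b == '\n') {
                out.write('\r');
                out.write('\n');
            } else {
                out.write(b);
            }
            i++;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "a\nb\rc\r\nd".getBytes(StandardCharsets.US_ASCII);
        String fixed = new String(toCrlf(raw), StandardCharsets.US_ASCII);
        System.out.println(fixed.equals("a\r\nb\r\nc\r\nd")); // prints true
    }
}
```

Each byte is read once and written at most twice, so the cost is linear
in the input, the same order as the scan for line terminators.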
Indeed, if we think it is better not to do that during parsing, there
is no need to talk about performance (I admit I don't know the mime4j
internals or the format used to temporarily store parts on disk, so I
don't know whether it is better to alter them while writing them or
while reading them back).
What about RootInputStream line counting? How should we update it if we
support isolated LF/CR in lenient parsing? How should it behave wrt
"binary encoded" parts?
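
One possible answer to the first question, as a sketch only: count
CRLF, an isolated LF, and an isolated CR as one line break each.
LenientLineCounter is a hypothetical class, not the current
RootInputStream code, and it deliberately leaves the question about
binary parts open.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical counter for illustration only; not a mime4j class.
class LenientLineCounter {
    private int lineNumber = 1;        // 1-based number of the current line
    private boolean lastWasCr = false; // true if the previous byte was CR

    // Feeds one byte to the counter.
    void accept(int b) {
        if (b == '\n') {
            // An LF directly after a CR belongs to the same CRLF break,
            // which was already counted when the CR was seen.
            if (!lastWasCr) {
                lineNumber++;
            }
            lastWasCr = false;
        } else if (b == '\r') {
            lineNumber++;
            lastWasCr = true;
        } else {
            lastWasCr = false;
        }
    }

    int getLineNumber() {
        return lineNumber;
    }

    // Convenience for experiments: line number reached at the end of the data.
    static int lineNumberAtEnd(byte[] data) {
        LenientLineCounter counter = new LenientLineCounter();
        for (byte b : data) {
            counter.accept(b);
        }
        return counter.getLineNumber();
    }

    public static void main(String[] args) {
        byte[] data = "a\rb\nc\r\nd".getBytes(StandardCharsets.US_ASCII);
        System.out.println(lineNumberAtEnd(data)); // prints 4
    }
}
```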
Stefano