Folks, I am pretty close to completing the refactoring of MimeTokenStream and related classes. However, before I proceed I would like to know whether it would be possible to change the way mime4j handles line delimiters. I can't help feeling the current approach based on EOLConvertingInputStream is wrong:
(1) EOLConvertingInputStream is a MAJOR performance killer. It currently reads one byte at a time, completely eliminating the performance gains made at the MimeBoundaryInputStream level. Even if it were changed to read more data at a time, it would still cause significant performance degradation due to the double buffering of the input data.

(2) EOLConvertingInputStream is superfluous. In real-world situations one cannot know beforehand whether incoming messages are standards-compliant. Any reasonable application needs to be lenient about line delimiters anyway, especially when dealing with MIME content sent over HTTP.

(3) Finally, EOLConvertingInputStream is broken. It rewrites every occurrence of a line delimiter regardless of its position within the MIME stream. Multipart messages may contain raw binary data (application/octet-stream); this is especially common when uploading multipart-encoded content over HTTP. EOLConvertingInputStream would simply corrupt such messages.

Would anyone object if I implemented lenient line delimiter handling only when parsing header fields / MIME boundaries and got rid of EOLConvertingInputStream?

Oleg

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
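[Editor's note] The lenient handling proposed above — accepting CR, LF, or CRLF as a line terminator only where line structure matters, instead of rewriting every byte of the stream up front — could be sketched roughly as follows. This is a hypothetical illustration; `LenientLineReader` and `readLine` are made-up names, not mime4j API.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: lenient line reading applied only at the points
// where lines matter (header fields, MIME boundaries). Bytes inside
// body parts are never touched, so raw binary content survives intact.
public class LenientLineReader {

    // Reads one line from the stream, accepting CRLF, bare LF, or bare CR
    // as the terminator. Returns null at end of stream.
    static String readLine(InputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        while (b != -1) {
            if (b == '\n') {            // bare LF terminates the line
                break;
            }
            if (b == '\r') {            // CR: swallow a following LF if present
                in.mark(1);
                int next = in.read();
                if (next != '\n' && next != -1) {
                    in.reset();         // bare CR: push the byte back
                }
                break;
            }
            sb.append((char) b);
            b = in.read();
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "Subject: test\r\nFrom: a@b\nX: y\rlast".getBytes("US-ASCII");
        // BufferedInputStream guarantees mark/reset support
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        System.out.println(readLine(in)); // Subject: test
        System.out.println(readLine(in)); // From: a@b
        System.out.println(readLine(in)); // X: y
        System.out.println(readLine(in)); // last
    }
}
```

Because the leniency lives in the reader used for headers and boundaries, the rest of the stream can be consumed in bulk reads, avoiding both the byte-at-a-time cost and the corruption of binary body parts described in points (1) and (3).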
