[jira] Commented: (MIME4J-62) Unnecessary qp encoding of SPACE and TAB characters in CodecUtil

Stefano Bagnara (JIRA) Mon, 21 Jul 2008 04:29:55 -0700

    [ 
https://issues.apache.org/jira/browse/MIME4J-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615214#action_12615214
 ]


Stefano Bagnara commented on MIME4J-62:
---------------------------------------

Where do the current binaryQuotedPrintable and Base64 encoders come from?
I've not been able to track this down: should we reuse the commons-net codecs
for these "common" output streams? (Maybe by copying them into our codebase, as
we may want to fix/alter them and not depend on commons-net for this.)

I checked mime4j 0.3 and, if I'm not missing anything, they were not there (we 
handled temporary files differently).

I think this code was introduced by Robert 8 weeks ago:
https://issues.apache.org/jira/browse/MIME4J-37?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
http://svn.apache.org/viewvc?view=rev&revision=660013
http://svn.apache.org/viewvc?view=rev&revision=660206

Robert, did you write that code from scratch? Should we try to fix it or 
reuse some other ASF code instead?
Wouldn't it be better to move the output streams to top-level classes?

> Unnecessary qp encoding of SPACE and TAB characters in CodecUtil
> ----------------------------------------------------------------
>
>                 Key: MIME4J-62
>                 URL: https://issues.apache.org/jira/browse/MIME4J-62
>             Project: Mime4j
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Niklas Therning
>            Priority: Minor
>             Fix For: 0.4
>
>
> ATM we always encode SPACE and TAB. The result is that the output of the 
> encoding is longer than necessary. According to the MIME RFC:
> (3)   (White Space) Octets with values of 9 and 32 MAY be
>           represented as US-ASCII TAB (HT) and SPACE characters,
>           respectively, but MUST NOT be so represented at the end
>           of an encoded line.  Any TAB (HT) or SPACE characters
>           on an encoded line MUST thus be followed on that line
>           by a printable character.  In particular, an "=" at the
>           end of an encoded line, indicating a soft line break
>           (see rule #5) may follow one or more TAB (HT) or SPACE
>           characters.  It follows that an octet with decimal
>           value 9 or 32 appearing at the end of an encoded line
>           must be represented according to Rule #1.  This rule is
>           necessary because some MTAs (Message Transport Agents,
>           programs which transport messages from one user to
>           another, or perform a portion of such transfers) are
>           known to pad lines of text with SPACEs, and others are
>           known to remove "white space" characters from the end
>           of a line.  Therefore, when decoding a Quoted-Printable
>           body, any trailing white space on a line must be
>           deleted, as it will necessarily have been added by
>           intermediate transport agents.
> To make the encoded output as short as possible we should try to not encode 
> SPACE and TAB unless they are the last character in a line.
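
A minimal sketch of the rule quoted above, for illustration only. It is not
the existing CodecUtil code: the class and method names are hypothetical, and
soft line breaks (the 76-character line limit) are deliberately left out.
SPACE and TAB are written literally unless they are the last octet before a
CRLF or the end of the input, in which case they are escaped.

```java
// Hypothetical helper, not part of Mime4j; soft line breaks are omitted.
class QpWhitespaceSketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String encode(byte[] in) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length; i++) {
            int b = in[i] & 0xFF;
            if (b == '\r' && i + 1 < in.length && in[i + 1] == '\n') {
                // Hard line break: pass CRLF through unchanged.
                out.append("\r\n");
                i++;
            } else if (b == ' ' || b == '\t') {
                // Whitespace is only escaped when it ends a line or the input.
                boolean endOfLine = i + 1 == in.length
                        || (in[i + 1] == '\r' && i + 2 < in.length && in[i + 2] == '\n');
                if (endOfLine) {
                    escape(out, b);
                } else {
                    out.append((char) b);
                }
            } else if (b >= 33 && b <= 126 && b != '=') {
                // Printable US-ASCII other than "=" is written literally.
                out.append((char) b);
            } else {
                // Everything else, including "=", is written as =XX.
                escape(out, b);
            }
        }
        return out.toString();
    }

    private static void escape(StringBuilder out, int b) {
        out.append('=').append(HEX[b >> 4]).append(HEX[b & 0x0F]);
    }
}
```

With this sketch, the input "a b" followed by a trailing SPACE encodes to
"a b=20": the inner SPACE stays literal and only the trailing one is escaped.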

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
