QP encoding and dotstuffing issues (let's encode the "." to "=2E")
------------------------------------------------------------------

                 Key: MIME4J-186
                 URL: https://issues.apache.org/jira/browse/MIME4J-186
             Project: JAMES Mime4j
          Issue Type: Improvement
          Components: dom
    Affects Versions: 0.7
            Reporter: Stefano Bagnara
            Priority: Minor
             Fix For: 0.7


There are non-compliant SMTP/POP tools, gateways, and filters out there that 
mishandle dot stuffing.

I send trackable emails whose trackable URLs have a "." in their path. I 
estimate that when a "." ends up at the end or at the beginning of a line (in 
the QP-encoded HTML part), between 0.5% and 1% of recipients receive a bad URL 
(containing ".." instead of ".", or with the "." stripped).

I identified at least the AVG spam filter as showing this bad behaviour when 
filtering spam for a generic mail client (but not when used with Outlook). It 
seems that AVG intercepts the TCP connection and does its own processing, and 
this breaks when dots are at the beginning or end of a line.
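
To illustrate the two failure modes for a leading dot, here is a minimal Java 
sketch of dot stuffing as described in RFC 5321 section 4.5.2. The class and 
method names are hypothetical and not part of Mime4j. The sketch assumes that a 
broken agent either skips the unstuffing step or applies it twice, which would 
produce the ".." and stripped "." results respectively; I have not verified 
that this is what AVG actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Mime4j code.
public class DotStuffingSketch {

    // Sender side: prepend an extra "." to every line that starts with ".".
    public static List<String> stuff(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(line.startsWith(".") ? "." + line : line);
        }
        return out;
    }

    // Receiver side: remove one leading "." from every line that starts with ".".
    public static List<String> unstuff(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(line.startsWith(".") ? line.substring(1) : line);
        }
        return out;
    }

    public static void main(String[] args) {
        // A URL split across two body lines so that "." starts the second line.
        List<String> body = Arrays.asList("http://example.com/track", ".html");
        List<String> wire = stuff(body);

        System.out.println(unstuff(wire));          // compliant: original lines
        System.out.println(wire);                   // never unstuffed: "..html"
        System.out.println(unstuff(unstuff(wire))); // unstuffed twice: "html"
    }
}
```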

Of course the example is about a URL because that is the case I am able to 
monitor and have statistical evidence for, but this happens with any dot in the 
message, even in text/plain parts. Note that having the message altered also 
breaks DKIM/GPG signatures.

One way to fix this is to change the Quoted-Printable encoder so that it also 
encodes dots. This makes the QP-encoded part a bit less readable (who reads 
them manually today?) but it protects the stream from non-compliant mail 
agents. We could even encode the "." only when it is the first or the last 
character of a line, but I think that would be "weird" behaviour, so I propose 
to simply add the "." to the list of chars to be encoded.
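
As a rough sketch of the proposal, the following standalone Java encoder treats 
"." like "=" and always writes it as "=2E". This is not the Mime4j encoder and 
does not use its API; the class name, the mustEncode helper, and the choice of 
UTF-8 for the input bytes are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the Mime4j implementation.
public class QpDotSketch {
    private static final int MAX_LINE = 76; // RFC 2045 limit per encoded line

    public static String encode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder();
        int lineLen = 0;
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i] & 0xFF;
            // A CRLF pair is a hard line break and passes through unchanged.
            if (b == '\r' && i + 1 < bytes.length && bytes[i + 1] == '\n') {
                out.append("\r\n");
                lineLen = 0;
                i++;
                continue;
            }
            // Whitespace directly before a hard break or end of input must be encoded.
            boolean atLineEnd = i + 1 == bytes.length
                    || (bytes[i + 1] == '\r' && i + 2 < bytes.length && bytes[i + 2] == '\n');
            boolean trailingSpace = (b == ' ' || b == '\t') && atLineEnd;
            String token = (mustEncode(b) || trailingSpace)
                    ? String.format("=%02X", b)
                    : String.valueOf((char) b);
            // Insert a soft line break, keeping one column free for its "=".
            if (lineLen + token.length() > MAX_LINE - 1) {
                out.append("=\r\n");
                lineLen = 0;
            }
            out.append(token);
            lineLen += token.length();
        }
        return out.toString();
    }

    static boolean mustEncode(int b) {
        if (b == '=' || b == '.') return true;   // '.' is the proposed addition
        if (b == ' ' || b == '\t') return false; // handled by the trailing check
        return b < 33 || b > 126;                // controls and non-ASCII bytes
    }

    public static void main(String[] args) {
        System.out.println(encode("http://example.com/t.html"));
        // prints: http://example=2Ecom/t=2Ehtml
    }
}
```

Because no literal "." survives in the encoded output, no encoded line can start 
or end with a dot, wherever the soft line breaks fall.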

I think that this could be the new behaviour and that making it configurable 
would be a bit of "over-configurability", but if people think it should be 
configurable then please propose a way to configure the behaviour. The RFC 
gives us freedom in this regard (it specifies which chars HAVE TO be encoded 
and which MAY be left unencoded, but one could even encode every single char).
