On 9/19/19 5:46 PM, Gilles Chehade wrote:
> Hello,
> 
> The RFC for SMTP states the following (section 2.3.8):
> 
>     In addition, the appearance of "bare" "CR" or "LF" characters in text
>     (i.e., either without the other) has a long history of causing
>     problems in mail implementations and applications that use the mail
>     system as a tool.  SMTP client implementations MUST NOT transmit
>     these characters except when they are intended as line terminators
>     and then MUST, as indicated above, transmit them only as a <CRLF>
>     sequence.
> 
> 
> As a result, OpenSMTPD rejects DATA containing <CR> with the following:
> 
>     500 5.0.0 <CR> is only allowed before <LF>
> 
> requiring that clients encode DATA if <CR> is part of it.
> 
> My question is: are we too strict ?
> 
> Not two MTA do the same thing. Some will leave '\r' in the body and then
> write it to the user mailbox or relay it. Other change it into a '\n' or
> skip it. The first ones take the risk of a MUA not handling '\r' well or
> an MTA rejecting later, the second ones break DKIM-signatures.
> 
> The only good way to deal with this is to stick to the RFC ... BUT users
> then experience message rejections when using broken clients (semarie@'s
> printer is an example of one).
> 
> So:
> 
> a- do we leave '\r' in the body ?
> b- do we turn '\r' into '\n'
> c- do we keep strict behavior ?
> d- do we keep strict behavior + provide a knob for '\r' to work ?
> 
> 
To be blunt, tl;dr: framed that simply, none of the options is
suitable.

Let's start with two definitions from RFC 5321, section 4.1.1.4:
   The receiver normally sends a 354 response to DATA, and then treats
   the lines (strings ending in <CRLF> sequences, as described in
   Section 2.3.7) following the command as mail data from the sender.
   This command causes the mail data to be appended to the mail data
   buffer.  The mail data may contain any of the 128 ASCII character
   codes, although experience has indicated that use of control
   characters other than SP, HT, CR, and LF may cause problems and
   SHOULD be avoided when possible.

   The custom of accepting lines ending only in <LF>, as a concession to
   non-conforming behavior on the part of some UNIX systems, has proven
   to cause more interoperability problems than it solves, and SMTP
   server systems MUST NOT do this, even in the name of improved
   robustness.  In particular, the sequence "<LF>.<LF>" (bare line
   feeds, without carriage returns) MUST NOT be treated as equivalent to
   <CRLF>.<CRLF> as the end of mail data indication.

In other words, if we want to be SMTP-strict we must be 7-bit clean and
we must also not accept a bare '\n' as an end of line. This in turn
means that iobuf.c:177 is wrong in that it makes the '\r' optional. This
makes us lose information that is critical for DKIM signatures.
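
To make the strict reading concrete, here is a minimal sketch in Python.
It is not OpenSMTPD's code, and split_strict is a hypothetical helper.
It treats only <CRLF> as a line terminator and reports a bare '\r' or
'\n' instead of silently dropping it:

```python
def split_strict(data: bytes):
    """Split a DATA payload into lines terminated by CRLF only.

    Returns (lines, leftover), where leftover is the unterminated tail
    that should be prepended to the next chunk read from the socket.
    Raises ValueError when a terminated line contains a bare CR or LF.
    """
    lines = []
    start = 0
    while True:
        end = data.find(b"\r\n", start)
        if end == -1:
            break
        line = data[start:end]
        # Anything left inside the line is a CR or LF that is not part
        # of a CRLF pair, which the strict reading forbids.
        if b"\r" in line or b"\n" in line:
            raise ValueError("bare CR or LF inside line")
        lines.append(line)
        start = end + 2
    return lines, data[start:]
```

The leftover is deliberately not checked: a trailing '\r' may still be
completed by a '\n' in the next chunk.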

Additionally, the current implementation would also fail a DKIM
verification: if someone sends a line without '\r', verification fails
because the verifier can't know whether the original line ended in '\r'
and thus must assume it did (in the current situation).
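
A simplified sketch of that ambiguity follows. It hashes the body bytes
directly with SHA-256, which is only a stand-in for DKIM's body hash
(real canonicalization does more), and body_hash and rebuild are
hypothetical names:

```python
import hashlib


def body_hash(body: bytes) -> str:
    """Stand-in for a body hash: SHA-256 over the raw body bytes."""
    return hashlib.sha256(body).hexdigest()


def rebuild(lines, terminator=b"\r\n"):
    """Reassemble a body from terminator-less lines, guessing the terminator."""
    return b"".join(line + terminator for line in lines)


# On the wire, the first line ended in a bare LF.
sent = b"hello\nworld\r\n"
# What a verifier is handed once the terminators have been stripped.
stripped = [b"hello", b"world"]
# The verifier has to assume CRLF everywhere, so the hashes disagree.
mismatch = body_hash(rebuild(stripped)) != body_hash(sent)
```

Here mismatch is True: once the terminator is gone, the original bytes
cannot be recovered and the guess does not match what was sent.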

I understand that "dumb" filters built around basic shell tools need to
assume lines end in '\n', but that's not 7-bit clean and it breaks mails
that are valid but not Unix-compatible. One way to work around this
would be to add a second command (let's call it filter-dataline7bit)
which sends the lines unaltered and can be used by fully compatible
tools to do proper 7-bit clean filtering. If someone has a use case
where they don't care whether the input is 7-bit clean (because they
know the input is always properly formatted), they can use the current
filter-dataline.
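
The difference between the two modes can be sketched like this. Both
function names are hypothetical, and this says nothing about the actual
filter protocol format; it only contrasts handing out lines with and
without their original terminators:

```python
def lines_unaltered(data: bytes):
    """Split after each LF but keep every byte, terminators included."""
    out = []
    start = 0
    while True:
        end = data.find(b"\n", start)
        if end == -1:
            break
        out.append(data[start:end + 1])
        start = end + 1
    if start < len(data):
        out.append(data[start:])  # unterminated tail
    return out


def lines_stripped(data: bytes):
    """Drop the LF and an optional preceding CR from each line."""
    out = []
    for line in lines_unaltered(data):
        if line.endswith(b"\n"):
            line = line[:-1]
            if line.endswith(b"\r"):
                line = line[:-1]
        out.append(line)
    return out


data = b"one\r\ntwo\n"
# Unaltered lines can be joined back into the exact original bytes.
roundtrip_ok = b"".join(lines_unaltered(data)) == data
# Stripped lines look identical whether they ended in CRLF or LF.
ambiguous = lines_stripped(data) == lines_stripped(b"one\ntwo\r\n")
```

Both roundtrip_ok and ambiguous are True: the unaltered form preserves
the input exactly, while the stripped form maps different inputs to the
same output.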

In conclusion: I don't care at all whether the input line contains a
'\r' (from a DKIM perspective), but I do care that my input is
consistent, that I know how my lines end, and that what I send back is
not altered afterwards.
