Charles, you missed the cases introduced by RFC 3030 and the CHUNKING
ESMTP extension.
Further comments below.
Tony Hansen
[EMAIL PROTECTED]
Charles Lindsey wrote:
> Let us be VERY careful here. Start from RFC 2822:
>
> message = (fields / obs-fields)
> [CRLF body]
> body = *(*998text CRLF) *998text
>
> So a <body> can be EMPTY, and its last line might not have a CRLF.
>
> The CRLF following the header fields is NOT part of the <body>.
>
> If the <body> is absent (indistinguishable from an empty <body>) that
> CRLF after the header fields can be omitted.
>
> Now look at RFC 2821:
>
> The mail data is terminated by a line containing only a period, that
> is, the character sequence "<CRLF>.<CRLF>" (see section 4.5.2). This
> is the end of mail data indication. Note that the first <CRLF> of
> this terminating sequence is also the <CRLF> that ends the final line
> of the data (message text) or, if there was no data, ends the DATA
> command itself.
>
> So, even if you have a body with no CRLF, as permitted by RFC 2822, you
> can't actually transmit it by RFC 2821
Correct, 2821 requires complete lines to be transmitted, that is, lines
ending in CRLF.
> (well, you might transmit it by
> UUCP, and you might encapsulate it in a message/rfc822 within some
> multipart).
Add in RFC 3030 and CHUNKING, and you get ways of transmitting, over
ESMTP, messages that are RFC 2822-compliant but not RFC 2821-compliant.
Add in the MIME RFCs and you also eliminate the requirements that the
lines be limited to 998 characters and consist of text.
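As a rough illustration of the difference, here is a sketch of how the
two framings treat a body whose last line lacks a CRLF. The function
names are illustrative assumptions, and dot-stuffing and the rest of the
SMTP dialogue are left out:

```python
CRLF = b"\r\n"

def frame_with_data(message: bytes) -> bytes:
    """Frame mail data for the RFC 2821 DATA command (dot-stuffing omitted).

    The terminator is CRLF "." CRLF, and its first CRLF doubles as the end
    of the last line, so a final line without CRLF has one supplied.
    """
    if message and not message.endswith(CRLF):
        # The receiver cannot tell that this CRLF was never in the message.
        message += CRLF
    # With no data at all, the CRLF of the DATA command itself serves as
    # the first CRLF of the terminator, so only ".CRLF" is sent.
    return message + b"." + CRLF

def frame_with_bdat(message: bytes) -> bytes:
    """Frame the whole message as a single RFC 3030 chunk: BDAT <size> LAST."""
    command = b"BDAT " + str(len(message)).encode("ascii") + b" LAST" + CRLF
    # Exactly len(message) octets follow the command; nothing is added.
    return command + message

msg = b"Last-Header: foobar\r\n\r\nsomething\r\nanything"
print(frame_with_data(msg))
print(frame_with_bdat(msg))
```

With DATA the last line arrives as "anything" plus CRLF; with BDAT it
arrives exactly as sent, with no CRLF after "anything".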
> So we have the following cases. The dotted lines enclose what is, by RFC
> 2822 definition, the <body>, and is therefore what will get hashed or
> canonicalized by dkim-base, as currently worded. The ".CRLF" is the RFC
> 2821 DATA terminator.
Correct.
> 1) ordinary message with <body> of 1 non-empty line:
>
> Last-Header: foobarCRLF
> CRLF
> ---------------------
> barbazCRLF
> ---------------------
> .CRLF
>
> 2) <body> consisting of 2 empty lines
>
> Last-Header: foobarCRLF
> CRLF
> ---------------------
> CRLF
> CRLF
> ---------------------
> .CRLF
>
> 3) <body> consisting of 1 empty line
>
> Last-Header: foobarCRLF
> CRLF
> ---------------------
> CRLF
> ---------------------
> .CRLF
>
> 4) <body> containing no lines
>
> Last-Header: foobarCRLF
> CRLF
> ---------------------
> ---------------------
> .CRLF
>
> 5) message with absent <body>
>
> Last-Header: foobarCRLF
> .CRLF
>
> Now apply simple canonicalization to all those cases, using:
>
> "In more formal terms, the "simple" body canonicalization algorithm
> converts "0*CRLF" at the end of the body to a single "CRLF"."
>
> Making the entirely reasonable assumption that "body" means exactly what
> RFC 2822 defines it to mean, then here is what gets hashed in all of
> those cases:
>
> 1) ordinary message with <body> of 1 non-empty line:
> ---------------------
> barbazCRLF
> ---------------------
>
> 2) <body> consisting of 2 empty lines
> ---------------------
> CRLF
> ---------------------
>
> 3) <body> consisting of 1 empty line
> ---------------------
> CRLF
> ---------------------
>
> 4) <body> containing no lines
> ---------------------
> CRLF
> ---------------------
>
> 5) message with absent <body>
> ---------------------
> ---------------------
I contend that the current wording in base-07 also requires that example
5 canonicalize into a
---------------------
CRLF
---------------------
Even when the body doesn't exist, it still must be treated as having 0
lines following, which still canonicalize to a CRLF.
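To make the two readings of case 5 concrete, here is a minimal sketch of
the literal "0*CRLF" rule. The function name and the absent_is_empty
flag are illustrative assumptions, not anything in base-07; the flag
selects between treating an absent body as a body with 0 lines (the
reading argued above) and treating it as nothing to hash (Charles's
reading):

```python
from typing import Optional

CRLF = b"\r\n"

def simple_body_literal(body: Optional[bytes], absent_is_empty: bool = True) -> bytes:
    """Literal reading: convert 0*CRLF at the end of the body to one CRLF.

    body is None when the message has no <body> at all (case 5).
    """
    if body is None:
        if not absent_is_empty:
            return b""   # Charles's reading: no body, so nothing to convert
        body = b""       # the other reading: a body with 0 lines
    while body.endswith(CRLF):
        body = body[:-len(CRLF)]   # drop every trailing CRLF (the "0*CRLF")
    return body + CRLF             # and put back exactly one

cases = {
    1: b"barbaz\r\n",
    2: b"\r\n\r\n",
    3: b"\r\n",
    4: b"",
    5: None,
}
for number, body in cases.items():
    print(number, simple_body_literal(body))
```

With the default flag, cases 2 through 5 all come out as a lone CRLF;
with absent_is_empty=False, case 5 alone comes out empty.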
But even with my contention on case #5, I don't disagree with your
conclusions here:
> That is undoubtedly what the "formal terms" in dkim-base SAY.
>
> It is NOT what the "informal" words in dkim-base say.
> It is NOT what version -01 of DK says.
> It is NOT what version -06 of DK says.
> It is NOT what Eric's three examples claim.
> It is entirely possible that it is NOT what dkim-base was INTENDED to say.
That's why the issue was raised.
I firmly believe that we *intended* to canonicalize each of these cases
into the empty body
---------------------
---------------------
Tony Hansen
[EMAIL PROTECTED]
PS. For completeness, the only missing cases, after taking into
consideration RFC 3030 and MIME, are as follows. *These* are the reason
that the 0*CRLF rule was added and where it needs to be applied:
6) ordinary message with <body> of >1 non-empty line, not ending in CRLF
Content-Type: binary
Last-Header: foobarCRLF
CRLF
---------------------
somethingCRLF
anything
---------------------
7) ordinary message with <body> of 1 non-empty line, not ending in CRLF
Content-Type: binary
Last-Header: foobarCRLF
CRLF
---------------------
anything
---------------------
Now apply simple canonicalization to all those cases, using:
"In more formal terms, the "simple" body canonicalization algorithm
converts "0*CRLF" at the end of the body to a single "CRLF"."
This winds up adding a CRLF to the last line of both of these cases, so
here is what gets hashed in each of them:
6) ordinary message with <body> of >1 non-empty line, not ending in CRLF
---------------------
somethingCRLF
anythingCRLF
---------------------
7) ordinary message with <body> of 1 non-empty line, not ending in CRLF
---------------------
anythingCRLF
---------------------
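The practical effect can be sketched as follows, assuming for
illustration that the body hash is SHA-256 over the canonicalized body
(the function name is hypothetical): a body sent without a final CRLF
and the same body with one added along the way hash identically.

```python
import hashlib

CRLF = b"\r\n"

def canonical_body_hash(body: bytes) -> str:
    """Hash a body after converting its trailing 0*CRLF to a single CRLF."""
    while body.endswith(CRLF):
        body = body[:-len(CRLF)]
    return hashlib.sha256(body + CRLF).hexdigest()

# Case 7 as sent with no final CRLF, and as it might look if a later hop
# had to add one.
print(canonical_body_hash(b"anything") == canonical_body_hash(b"anything\r\n"))
```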