Charles, you missed the cases introduced by RFC 3030 and the CHUNKING
ESMTP extension.

Further comments below.

        Tony Hansen
        [EMAIL PROTECTED]

Charles Lindsey wrote:
> Let us be VERY careful here. Start from RFC 2822:
> 
> message         =       (fields / obs-fields)
>                         [CRLF body]
> body            =       *(*998text CRLF) *998text
> 
> So a <body> can be EMPTY, and its last line might not have a CRLF.
> 
> The CRLF following the header fields is NOT part of the <body>.
> 
> If the <body> is absent (indistinguishable from an empty <body>) that
> CRLF after the header fields can be omitted.
> 
> Now look at RFC 2821:
> 
>    The mail data is terminated by a line containing only a period, that
>    is, the character sequence "<CRLF>.<CRLF>" (see section 4.5.2).  This
>    is the end of mail data indication.  Note that the first <CRLF> of
>    this terminating sequence is also the <CRLF> that ends the final line
>    of the data (message text) or, if there was no data, ends the DATA
>    command itself.
> 
> So, even if you have a body with no CRLF, as permitted by RFC 2822, you
> can't actually transmit it by RFC 2821 

Correct. RFC 2821 requires complete lines to be transmitted, that is,
lines ending in CRLF.

> (well, you might transmit it by
> UUCP, and you might encapsulate it in a message/rfc822 within some
> multipart).

Add in RFC 3030 and CHUNKING, and you get ways of transmitting
messages using ESMTP that are RFC 2822-compliant but not RFC
2821-compliant. Add in the MIME RFCs and you also eliminate the
requirements that the lines be limited to 998 characters and consist of
text.
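
To make the DATA versus BDAT difference concrete, here is a minimal
Python sketch of the two framings. It is an illustration only: the
helper names frame_for_data and frame_for_bdat are made up, the input is
assumed to use CRLF line endings exclusively, and the whole message is
sent as a single BDAT chunk.

```python
CRLF = b"\r\n"

def frame_for_data(message: bytes) -> bytes:
    """Frame a message for the RFC 2821 DATA command (hypothetical helper)."""
    # DATA is line-oriented: the last line must end in CRLF, because that
    # CRLF is also the first CRLF of the "<CRLF>.<CRLF>" terminator.
    if message and not message.endswith(CRLF):
        message += CRLF
    # Dot-stuffing (RFC 2821 section 4.5.2): prefix an extra "." to any
    # line that already starts with ".".
    lines = message.split(CRLF)
    stuffed = CRLF.join(b"." + ln if ln.startswith(b".") else ln for ln in lines)
    return stuffed + b"." + CRLF

def frame_for_bdat(message: bytes) -> bytes:
    """Frame a message as one RFC 3030 BDAT chunk (hypothetical helper)."""
    # BDAT carries an exact octet count, so the bytes go out unchanged:
    # no terminator line, no dot-stuffing, no CRLF added at the end.
    return b"BDAT %d LAST" % len(message) + CRLF + message

msg = b"Last-Header: foobar\r\n\r\nanything"   # body has no final CRLF
print(frame_for_data(msg))
print(frame_for_bdat(msg))
```

With DATA the body necessarily arrives as "anythingCRLF"; with BDAT it
arrives exactly as "anything".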

> So we have the following cases. The dotted lines enclose what is, by RFC
> 2822 definition, the <body>, and is therefore what will get hashed or
> canonicalized by dkim-base, as currently worded. The ".CRLF" is the RFC
> 2821 DATA terminator.

Correct.

> 1) ordinary message with <body> of 1 non-empty line:
> 
> Last-Header: foobarCRLF
> CRLF
> ---------------------
> barbazCRLF
> ---------------------
> .CRLF
> 
> 2) <body> consisting of 2 empty lines
> 
> Last-Header: foobarCRLF
> CRLF
> ---------------------
> CRLF
> CRLF
> ---------------------
> .CRLF
> 
> 3) <body> consisting of 1 empty line
> 
> Last-Header: foobarCRLF
> CRLF
> ---------------------
> CRLF
> ---------------------
> .CRLF
> 
> 4) <body> containing no lines
> 
> Last-Header: foobarCRLF
> CRLF
> ---------------------
> ---------------------
> .CRLF
> 
> 5) message with absent <body>
> 
> Last-Header: foobarCRLF
> .CRLF
> 
> Now apply simple canonicalization to all those cases, using:
> 
>    "In more formal terms, the "simple" body canonicalization algorithm
>     converts "0*CRLF" at the end of the body to a single "CRLF"."
> 
> Making the entirely reasonable assumption that "body" means exactly what
> RFC 2822 defines it to mean, then here is what gets hashed in all of
> those cases:
> 
> 1) ordinary message with <body> of 1 non-empty line:
> ---------------------
> barbazCRLF
> ---------------------
> 
> 2) <body> consisting of 2 empty lines
> ---------------------
> CRLF
> ---------------------
> 
> 3) <body> consisting of 1 empty line
> ---------------------
> CRLF
> ---------------------
> 
> 4) <body> containing no lines
> ---------------------
> CRLF
> ---------------------
> 
> 5) message with absent <body>
> ---------------------
> ---------------------

I contend that the current wording in base-07 also requires that example
5 canonicalize into a

---------------------
CRLF
---------------------

Even when the body doesn't exist, it must still be treated as having 0
lines following, which still canonicalizes to a CRLF.

But even with my contention on case #5, I don't disagree with your
conclusions here:

> That is undoubtedly what the "formal terms" in dkim-base SAY.
> 
> It is NOT what the "informal" words in dkim-base say.
> It is NOT what version -01 of DK says.
> It is NOT what version -06 of DK says.
> It is NOT what Eric's three examples claim.
> It is entirely possible that it is NOT what dkim-base was INTENDED to say.

That's why the issue was raised.

I firmly believe that we *intended* to canonicalize each of these cases
into the empty body

---------------------
---------------------
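
The gap between the literal wording and the intent can be sketched in a
few lines of Python. This is an illustration, not text from the draft:
the function names are made up, bodies are assumed to use CRLF line
endings only, an absent body (case 5) is modeled as an empty byte
string, and canon_intended encodes one reading of the intent, namely
that a body consisting of nothing but CRLFs hashes as the empty body.

```python
CRLF = b"\r\n"

def strip_trailing_crlfs(body: bytes) -> bytes:
    """Remove every CRLF at the very end of the body."""
    while body.endswith(CRLF):
        body = body[: -len(CRLF)]
    return body

def canon_formal(body: bytes) -> bytes:
    """Literal reading: zero or more trailing CRLFs become exactly one."""
    return strip_trailing_crlfs(body) + CRLF

def canon_intended(body: bytes) -> bytes:
    """Assumed intent: as above, but a body of only CRLFs becomes empty."""
    stripped = strip_trailing_crlfs(body)
    return stripped + CRLF if stripped else b""

cases = {
    1: b"barbaz\r\n",
    2: b"\r\n\r\n",
    3: b"\r\n",
    4: b"",
    5: b"",  # absent body, modeled as empty (assumption)
}
for number, body in cases.items():
    print(number, canon_formal(body), canon_intended(body))
```

Under the literal reading, cases 2 to 5 all yield a lone CRLF; under the
assumed intent they all yield zero bytes. Case 1 is unchanged either way.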

        Tony Hansen
        [EMAIL PROTECTED]

PS. For completeness, the only missing cases, after taking into
consideration RFC 3030 and MIME, are as follows. *These* are the reason
that the 0*CRLF rule was added and where it needs to be applied:

6) ordinary message with <body> of >1 non-empty line, not ending in CRLF

Content-Type: binary
Last-Header: foobarCRLF
CRLF
---------------------
somethingCRLF
anything
---------------------

7) ordinary message with <body> of 1 non-empty line, not ending in CRLF

Content-Type: binary
Last-Header: foobarCRLF
CRLF
---------------------
anything
---------------------

Now apply simple canonicalization to all those cases, using:

   "In more formal terms, the "simple" body canonicalization algorithm
    converts "0*CRLF" at the end of the body to a single "CRLF"."

This winds up adding a CRLF to the last line of both of these cases, so
here is what gets hashed in all of these additional cases:

6) ordinary message with <body> of >1 non-empty line, not ending in CRLF

---------------------
somethingCRLF
anythingCRLF
---------------------

7) ordinary message with <body> of 1 non-empty line, not ending in CRLF

---------------------
anythingCRLF
---------------------
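
As a rough check of why this matters for cases 6 and 7, the Python
sketch below hashes a body with and without its final CRLF after
applying the literal 0*CRLF rule. The names are made up for
illustration, SHA-256 with base64 output is an assumption about the hash
in use, and the "with CRLF" variants stand in for what a line-oriented
hop would deliver.

```python
import base64
import hashlib

CRLF = b"\r\n"

def canon_formal(body: bytes) -> bytes:
    """Literal 0*CRLF rule: zero or more trailing CRLFs become exactly one."""
    while body.endswith(CRLF):
        body = body[: -len(CRLF)]
    return body + CRLF

def body_hash(body: bytes) -> str:
    """Base64 of the SHA-256 digest of the canonicalized body (assumed hash)."""
    digest = hashlib.sha256(canon_formal(body)).digest()
    return base64.b64encode(digest).decode("ascii")

case6 = b"something\r\nanything"   # no final CRLF, as in case 6
case7 = b"anything"                # no final CRLF, as in case 7

# The same bodies as they would look once a final CRLF has been added.
case6_with_crlf = case6 + CRLF
case7_with_crlf = case7 + CRLF

print(body_hash(case6) == body_hash(case6_with_crlf))
print(body_hash(case7) == body_hash(case7_with_crlf))
```

Both comparisons come out equal, because canonicalization makes the
presence or absence of the final CRLF irrelevant to the hash.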

