On Sat, Nov 27, 2021 at 7:34 PM Ned Freed <[email protected]> wrote:

> > It appears that Wei Chuang  <[email protected]> said:
> > > If the RFC2045 canonical representation at the final destination can
> > > be the same as the canonical representation at the original sender, ...
>
> > When we were working on DKIM canonicalization we had lengthy discussions
> > about what to do about MIME and we decided not to even try.
>
> A mistake IMO.
>
> > There is no canonical
> > representation of a MIME message and nobody to my knowledge has ever
> > tried to describe what it would mean for two MIME messages to be
> > equivalent, since they could vary in a fantastic number of ways.
>
> First, a canonical form doesn't have to produce a 100% reliable equivalency
> test in order to be useful.
>
> Second, there can be more to a hash computation than a canonical form. This
> is especially true given that a MIME message is a tree.
>
> > Part separators can change, the
> > pieces of multipart/whatever might change, line breaks in
> > quoted-printable and base64 can change, spacing and capitalization of
> > headers can change, and that's just what I can think of in two minutes.
>
> If you treat the message as a Merkle tree with:
>
> o Separate header and body hashes
> o Decoding message bodies prior to hashing
> o Applying the already-defined unfolding/capitalization stuff from DKIM
>   to part headers.
> o Removing the CTE field and boundary value from CT fields in the header
>
> You end up with a value that's:
>
> o Invariant in regards to part separator changes
> o Invariant in regards to CTE changes
> o Invariant in regards to many/most common header changes
> o Allows for rapid computation of hashes for large numbers of large
>   messages that share common content.
>
> Which I note takes care of your list.
>

This approach and its benefits are what I was thinking could be feasible as
well.  The cited draft-kucherawy-dkim-list-canon
<https://datatracker.ietf.org/doc/html/draft-kucherawy-dkim-list-canon>
notes your contribution to the concept described there, i.e. performing
hashing over a MIME tree (though that draft doesn't decode the
Content-Transfer-Encoding).
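
To make the idea concrete, here is a rough Python sketch of such a tree
hash, using only the standard library.  It is an illustration under my own
assumptions, not what the draft or DKIM specifies: the names tree_hash,
header_hash and canon_header are made up, only Content-* fields are hashed
per part (other message headers are assumed to be covered by the usual DKIM
header hash), header lines are sorted so field order doesn't matter, and
SHA-256 is an arbitrary choice.

```python
import hashlib
from email import message_from_bytes

# Fields dropped before hashing, per the list above (the CTE field).
DROPPED_FIELDS = {"content-transfer-encoding"}


def canon_header(name, value):
    """DKIM 'relaxed'-style: lowercase the name, unfold, collapse whitespace."""
    unfolded = " ".join(str(value).split())
    return f"{name.lower()}:{unfolded}\r\n".encode("utf-8")


def header_hash(part):
    """Hash a part's Content-* fields, minus CTE and the boundary parameter."""
    lines = []
    for name, value in part.items():
        lname = name.lower()
        if not lname.startswith("content-") or lname in DROPPED_FIELDS:
            continue
        if lname == "content-type":
            # Rebuild the field from its parsed form, without the boundary.
            params = [
                (k.lower(), v)
                for k, v in part.get_params(failobj=[])[1:]
                if k.lower() != "boundary"
            ]
            value = "; ".join(
                [part.get_content_type()]
                + [f"{k}={v}" for k, v in sorted(params)]
            )
        lines.append(canon_header(name, value))
    h = hashlib.sha256()
    for line in sorted(lines):  # assumption: field order is not significant
        h.update(line)
    return h.digest()


def tree_hash(part):
    """Merkle-style hash: header hash plus child hashes or decoded body hash."""
    h = hashlib.sha256()
    h.update(header_hash(part))
    if part.is_multipart():
        for child in part.get_payload():
            h.update(tree_hash(child))
    else:
        # decode=True undoes base64 / quoted-printable before hashing.
        body = part.get_payload(decode=True) or b""
        h.update(hashlib.sha256(body).digest())
    return h.digest()
```

Calling tree_hash(message_from_bytes(raw)) on two copies of a message that
differ only in boundary string, CTE, or header folding and case should give
the same digest.  Because each subtree has its own hash, a real
implementation could also cache the digests of parts shared across many
messages.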


> But the question is, as always, whether or not defining such a thing is
> worth the trouble. At this point I think the answer is "no".
>

What kind of concern do you have?  Is it algorithmic complexity, or
runtime or header-size overhead?

-Wei
_______________________________________________
dmarc mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dmarc
