On Sat, Nov 27, 2021 at 7:34 PM Ned Freed <[email protected]> wrote:
> > It appears that Wei Chuang <[email protected]> said:
> > > If the RFC2045 canonical representation at the final destination can be the
> > > same as the canonical representation at the original sender, ...
>
> > When we were working on DKIM canonicalization we had lengthy discussions about
> > what to do about MIME and we decided not to even try.
>
> A mistake IMO.
>
> > There is no canonical
> > representation of a MIME message and nobody to my knowledge has ever tried to
> > describe what it would mean for two MIME messages to be equivalent, since they
> > could vary in a fantastic number of ways.
>
> First, a canonical form doesn't have to produce a 100% reliable equivalency
> test in order to be useful.
>
> Second, there can be more to a hash computation than a canonical form. This
> is especially true given that a MIME message is a tree.
>
> > Part separators can change, the
> > pieces of multipart/whatever might change, line breaks in quoted-printable
> > and base64 can change, spacing and capitalization of headers can change, and
> > that's just what I can think of in two minutes.
>
> If you treat the message as a Merkle tree with:
>
>  o Separate header and body hashes
>  o Decoding message bodies prior to hashing
>  o Applying the already-defined unfolding/capitalization stuff from DKIM
>    to part headers.
>  o Removing the CTE field and boundary value from CT fields in the header
>
> You end up with a value that's:
>
>  o Invariant in regards to part separator changes
>  o Invariant in regards to CTE changes
>  o Invariant in regards to many/most common header changes
>  o Allows for rapid computation of hashes for large numbers of large messages
>    that share common content.
>
> Which I note takes care of your list.
>

This approach, and its benefits, are what I was thinking could be feasible as well.
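
To make the quoted list concrete, here is a minimal sketch of how such a MIME-tree hash might look. It is only an illustration under stated assumptions and is not taken from any draft: the choice of SHA-256, the names `canon_headers` and `node_hash`, the exact header normalization (lowercased names, unfolded and whitespace-compressed values, loosely modeled on DKIM "relaxed"), and the way child hashes are concatenated are all hypothetical.

```python
import hashlib
from email import message_from_bytes
from email.message import Message
from email.utils import collapse_rfc2231_value

# Assumption: only Content-Transfer-Encoding is dropped entirely.
SKIPPED = {"content-transfer-encoding"}


def _h(data: bytes) -> bytes:
    """SHA-256 digest (the hash algorithm is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def canon_headers(part: Message) -> bytes:
    """Normalize one part's headers: lowercase names, unfold and compress
    whitespace, drop the CTE field, drop the boundary parameter."""
    out = []
    for name, value in part.items():
        name = name.lower()
        if name in SKIPPED:
            continue
        if name == "content-type":
            # Rebuild the value from its parameters, minus "boundary".
            params = part.get_params(failobj=[])
            value = "; ".join(
                k.lower() if v == "" else f"{k.lower()}={collapse_rfc2231_value(v)}"
                for k, v in params
                if k.lower() != "boundary"
            )
        # Unfold and collapse runs of whitespace to a single space.
        value = " ".join(str(value).split())
        out.append(f"{name}:{value}\r\n")
    return "".join(out).encode("utf-8", "surrogateescape")


def node_hash(part: Message) -> bytes:
    """Merkle-style hash: H(H(headers) || H(body)), where the body of a
    multipart is the concatenation of its children's node hashes and the
    body of a leaf is its payload after transfer decoding."""
    header_hash = _h(canon_headers(part))
    if part.is_multipart():
        body_hash = _h(b"".join(node_hash(c) for c in part.get_payload()))
    else:
        body_hash = _h(part.get_payload(decode=True) or b"")
    return _h(header_hash + body_hash)
```

Calling `node_hash(message_from_bytes(raw)).hex()` on two copies of a message that differ only in boundary string, transfer encoding, or header folding and case should give the same value. Because each subtree has its own hash, an implementation could also cache hashes of parts it has already seen, which is where the speedup for messages sharing common content would come from.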
The cited draft-kucherawy-dkim-list-canon
<https://datatracker.ietf.org/doc/html/draft-kucherawy-dkim-list-canon>
notes your contribution to the concept described there, i.e. to perform
hashing as a MIME tree (though that draft doesn't do content-transfer decoding).

> But the question is, as always, whether or not defining such a thing is worth
> the trouble. At this point I think the answer is "no".
>

What type of concern do you have? Is it algorithmic complexity? Or runtime
or header size overhead?

-Wei
_______________________________________________
dmarc mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dmarc
