> -----Original Message-----
> From: [email protected] [mailto:[email protected]] 
> On Behalf Of Rolf E. Sonneveld
> Sent: Friday, October 01, 2010 2:24 PM
> To: [email protected]
> Subject: Re: [ietf-dkim] Updated implementation report
> 
> Remark: I'd suggest to transform the AOL data to also contain
> percentages, just like the results of the other (OpenDKIM statistics)
> project. It will make it easier to compare (the results of) the two
> 'projects' and in general, percentages give a better view on what was
> measured.

OK, I'll look into that for the next version.

> If this is the #1 reason that verifications fail, would there be room
> for a new canonicalization scheme, to improve verification rates? I
> know
> there are MTA's, implementing the principle of 'garbage in - garbage
> out', just like there are MTA's implementing the principle of 'be
> liberal in what you accept, be strict in what you emit'; the latter may
> add a missing Date field, or correct a syntactically wrong Date field,
> or modify To fields to match RFC5322 etc. This has been discussed
> before, and it is impossible to come up with a canonicalization scheme
> that addresses 100% of these modifications, but if we can address the
> top 5 or top 10 types of modifications (and hence reasons for
> verification failure), we might be able to further improve the
> verification score, don't we? Murray, do we have any figures about the
> total percentage of DKIM signatures, which were invalidated by header
> modifications, and a complete list of which headers were modified? Are
> we talking about 5%, 1%, 0.1%, 0.01%?

I don't think the data support any conclusions about possible canonicalization 
improvements yet.  I also can't (after admittedly spending only about 90 
seconds on it) imagine what that might look like, short of something like 
"relaxed plus no punctuation outside angle brackets" or the like.
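To make that half-baked idea concrete, here's a rough Python sketch of RFC 6376 "relaxed" header canonicalization plus a hypothetical "no punctuation outside angle brackets" variant.  The function names and the exact punctuation set are invented for illustration; this is not a proposal, just the shape of the thing:

```python
import re

PUNCT = ',;()"\''   # invented punctuation set for this sketch

def relaxed_header(name, value):
    # RFC 6376 "relaxed" header canonicalization: lowercase the
    # field name, unfold, collapse whitespace runs to one space,
    # and strip leading/trailing whitespace from the value.
    value = re.sub(r"\r?\n", "", value)    # unfold
    value = re.sub(r"[ \t]+", " ", value)  # collapse WSP
    return name.lower() + ":" + value.strip()

def relaxed_no_punct(name, value):
    # Hypothetical "relaxed plus no punctuation outside angle
    # brackets" variant; NOT part of any spec, just the idea above.
    canon = relaxed_header(name, value)
    label, _, val = canon.partition(":")
    out, depth = [], 0
    for ch in val:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth = max(0, depth - 1)
        elif depth == 0 and ch in PUNCT:
            continue   # drop punctuation outside <...>
        out.append(ch)
    return label + ":" + "".join(out)
```

So a display-name rewrite like adding or removing quotes around "Rolf E. Sonneveld" would survive, while anything inside the angle brackets would still be protected.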

The OpenDKIM statistics collected so far show that of 135549 signatures 
received, the header hashes lined up for 121821, meaning 13728 failed because 
of header changes.  Of those 121821, a further 6194 then failed on the body 
hash.  So failures caused by header changes beat those caused by body changes 
by just over two-to-one (13728 vs. 6194).
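Spelling out the arithmetic (treating the 6194 body-hash failures as separate from the 13728 header-hash failures):

```python
received    = 135549
header_ok   = 121821                 # header hashes lined up
header_fail = received - header_ok   # failed on header changes: 13728
body_fail   = 6194                   # header hash OK, body hash not

ratio = header_fail / body_fail      # roughly 2.2, i.e. just over two-to-one
```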

What we don't currently collect is a list of signed fields that were deleted 
in transit, a list of fields that were signed despite being absent (to prevent 
their later addition), or what exactly the various in-transit modifications 
were.  We also do only limited approximate matching: if a field changes a 
great deal, we don't count it, in order to avoid false positives (e.g., two 
unrelated Received: fields being compared to each other).  And since the 
contents of "z=" are assumed, for the purposes of this study, to be an 
accurate reflection of the signed header fields, we rely on signers to add 
them and to do so correctly.  So it's not a precise study, and we certainly 
don't collect enough information to say precisely which changes cause 
failures, but it still reveals some interesting stuff.
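For flavor, here's a very loose sketch of that kind of z=-based approximate matching.  This is illustrative Python, not OpenDKIM's actual code; the similarity threshold and the assumption that received header names are stored lowercased are both invented:

```python
import difflib
import re

def decode_z(qp):
    # "z=" values use DKIM quoted-printable: "=XX" hex escapes,
    # with "|" separating the copied header fields (split before
    # decoding, since a literal "|" inside a value is escaped).
    return re.sub(r"=([0-9A-Fa-f]{2})",
                  lambda m: chr(int(m.group(1), 16)), qp)

def changed_fields(z_value, received, threshold=0.5):
    # Compare each z= copy against the same field as received.
    # Count a change only when the strings are still fairly
    # similar, to limit false positives (e.g. two unrelated
    # Received: fields being compared to each other).
    changes = []
    for copy in z_value.split("|"):
        name, _, signed_val = decode_z(copy).partition(":")
        got = received.get(name.lower())
        if got is None:
            changes.append((name, "deleted in transit"))
        elif got != signed_val:
            sim = difflib.SequenceMatcher(None, signed_val, got).ratio()
            if sim >= threshold:
                changes.append((name, "modified"))
    return changes
```

The threshold is exactly the crude trade-off described above: too low and unrelated fields collide, too high and real modifications are missed.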

> This raises another question: a DKIM verification failure in itself is
> not a problem: a spammer signing with an incorrect signature, or
> replaying old (DKIM) message headers with new spam content will cause
> verification failures, which is how it is supposed to be. However,
> another category of DKIM verification failures may have to do with
> header modifications by downstream MTA's, invalidating DKIM signatures.
> The question here is: how can we gather statistics about these two (and
> maybe there are more) different categories of verification failures, or
> how can we differentiate between these two (or more) categories?

I suspect the best you can do, short of getting everyone along the way to save 
copies of all signed traffic for a while, is to encourage wider deployment of 
"z=" by both signers and verifiers.  That will be hard to do, though, since it 
doesn't actually help anyone other than those of us interested in advancing 
the protocol, and it just makes headers bigger, which sometimes runs into 
processing limits.  And several implementations probably never bothered to add 
support for it.
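On the header-size point: a signer building "z=" duplicates every signed field into the signature itself, escaped per RFC 6376's DKIM-quoted-printable.  A rough sketch (the escaping set here is simplified; see the RFC for the exact rules):

```python
def qp_escape(c):
    # Simplified DKIM-quoted-printable escape: encode "=", ";",
    # "|", and anything outside printable ASCII (incl. spaces).
    if c in "=;|" or not (0x21 <= ord(c) <= 0x7E):
        return "=%02X" % ord(c)
    return c

def encode_z(fields):
    # fields: (name, value) pairs of the headers being signed.
    return "|".join(n + ":" + "".join(qp_escape(c) for c in v)
                    for n, v in fields)
```

Since each signed field appears once in the message and again in the z= copy, the signed header bytes roughly double, which is where the processing-limit concern comes from.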

I think it's rare to find anyone validating signatures except at the sender and 
the receiver, so any mid-stream rewrites of stuff are happening silently.

Another thing that might be interesting to collect is a study of which MTAs are 
making which changes that break signatures, or which signature-breaking MLM 
actions are the most common.  These studies would be a lot harder to conduct, 
however.

_______________________________________________
NOTE WELL: This list operates according to 
http://mipassoc.org/dkim/ietf-list-rules.html
