On May 27, 2010, at 9:03 PM, Dave CROCKER wrote:

>
> On 5/27/2010 2:22 PM, Steve Atkins wrote:
>> I'll write up the methodology in a little more detail, but out of my sample
>
> eager to see the method description.  not lots of detail, just the gist of what
> criteria created each of the 4 values.

Sure.

It was a very quick and rough analysis, based just on my mailboxes. I assumed that all signing, DNS publication, DKIM checking and ADSP checking was performed perfectly. I assumed that any modification of the body or subject of the mail after it was sent would invalidate a DKIM signature. I did make a few simplifications to speed the analysis, but the effect of those was solely to reduce the number of phishing emails not rejected by ADSP that were counted.

My mailboxes are pre-categorised in a bunch of ways, such that it's easy for me to extract transactional mail, mail from discussion lists, mail from marketing lists and junk mail (including phishing).

>
>> the initial data is:
>>
>> Legitimate email from paypal:
>>
>> 72% rejected by ADSP
>> 28% not rejected

For these two groups I extracted all mail that had an @paypal.com email address that was categorized as legitimate email. I inspected all of them by hand, and they were a mixture of transactional notifications from paypal in response to payments, direct 1:1 mail from paypal.com employees and mail from paypal.com employees via mailing lists. I didn't actually check signatures, just considered the 1:1 mail and transactional mail as "not rejected" and considered the mail sent through mailing lists that I know modify the content as "rejected by ADSP".

>>
>> Phishing emails using "paypal" in the From line:
>>
>> 39% rejected by ADSP
>> 61% not rejected.

For this I extracted all the email categorised as "junk mail" that included the string "paypal" in some case or other in the RFC2822 From: field. That includes any use of "paypal" in the local part or domain part of the email, or in the "friendly from". It excludes any phish emails that didn't include the term paypal in the From: field, or which used B or Q-encoding or which used homoglyphs or misspellings of paypal. (This will exclude some phishes that would pass ADSP, but will not exclude any that would be rejected by ADSP).
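
As a rough illustration of the From: field selection rule described above, here is a minimal Python sketch. It is not the tooling actually used for the analysis, which isn't described; the function name selected_as_paypal_phish and the regular expression are assumptions made up for this example. It takes the raw (undecoded) From: header of a message already categorised as junk mail, skips anything containing an RFC 2047 encoded-word (B or Q encoding), and otherwise does a case-insensitive substring match on "paypal".

```python
import re

# Hypothetical pattern for an RFC 2047 encoded-word, e.g.
# =?utf-8?B?UGF5UGFs?= or =?iso-8859-1?Q?PayPal?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def selected_as_paypal_phish(from_header: str) -> bool:
    """Return True if a junk-mail From: field would fall in the sample.

    from_header is the raw header value, before any MIME decoding.
    """
    # B- or Q-encoded From: fields are excluded from the sample.
    if ENCODED_WORD.search(from_header):
        return False
    # "paypal" in any case, anywhere in the field: friendly from,
    # local part or domain part. Homoglyphs and misspellings do not
    # match a plain substring test, so they are excluded implicitly.
    return "paypal" in from_header.lower()
```

Under these assumptions, '"PayPal Security" <alerts@example.net>' and 'x@paypal-access.com' are both selected, while a B-encoded friendly from or a "PayPa1" lookalike is not.
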
I checked them quickly by eye - all appeared to be paypal phishes. Of those, I classified those where the from address was @paypal.com as "rejected by ADSP" and those where it wasn't as "not rejected".

Paypal is rather a special case, as they actively register many, many domains in a lot of TLDs that contain the word paypal or some misspelling of it, both proactively and in response to enforcement. I didn't consider those domains as triggering an ADSP rejection for a number of reasons. One is that many (most?) of them would have been acquired by paypal through enforcement action after the phishes were sent, and the other is that it's a behaviour (registering a huge number of domains purely to deny them to others) that's atypical and that doesn't scale.

Having said that, I did spot check quite a lot of the phishes that I'd tagged as "not rejected" and the vast majority weren't using domains I'd expect paypal to have proactively reserved (paypal.net, for instance) - they were mostly using the word "paypal" in the friendly from, the local part or a subdomain of the domain part. Of those that weren't of that form many were things like "@paypal-access.com" and suchlike.

So I think those two numbers are likely accurate to within a few percent or better.

>
> This is pretty interesting data.  It declares both FPs and FNs with ADSP, which
> certainly ain't part of any model I ever heard in support of its use.

I expect that it would improve drastically in some respects with more widespread use of ADSP - I'd expect paypal employees to migrate to using a non-paypal.com domain for their email, for example.

Also, my mailbox is more typical of someone in the industry than a consumer mailbox as, well, I get some mail from paypal.com that's neither transactional nor bulk.
Someone who was a pure consumer at a major ISP who didn't use mailing lists or forwarding services and had no interaction with anyone at paypal other than via their bulk email system would see a much lower FP rate.

>
>> It's also based on sender behaviour before there's significant actual
>> filtering via ADSP. I would expect less mail, both legitimate and illegitimate,
>> to be rejected by ADSP as time went on.
>
> Given that a standard carries strategic costs in terms of development,
> implementation and deployment (real dollars and time) one would think that its
> level of benefit should not decay, or at least not quickly.  Since it takes
> years to become useful it should take quite a few years before it becomes
> useless...

+1

Cheers,
  Steve

_______________________________________________
NOTE WELL: This list operates according to
http://mipassoc.org/dkim/ietf-list-rules.html
