 Sadly, I have more FP data for you. :(

 Here's one specific example (just a single very long line from
 one corpus):
  background-image: url("data:image/svg+xml;charset=utf8,%3Csvg
  width='104px' height='82px' viewBox='0 0 104 82' version='1.1'

Ok, I excluded image data from URI_DATA. This should reduce FPs without hurting spam/phish detection (I hope).
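
As a rough illustration of that exclusion, here is a minimal Python sketch. The actual URI_DATA rule is not shown in this thread, so the pattern and the function name below are assumptions, not the real rule definition. The idea is just a negative lookahead that accepts any data: URI whose media type does not start with "image/".

```python
import re

# Hypothetical pattern (not the real rule): match a data: URI unless its
# media type begins with "image/". Case-insensitive, since the scheme and
# media type are not case-sensitive.
NON_IMAGE_DATA_URI = re.compile(r"^data:(?!image/)", re.IGNORECASE)


def is_non_image_data_uri(uri: str) -> bool:
    """Return True for data: URIs that do not carry image content."""
    return NON_IMAGE_DATA_URI.match(uri.strip()) is not None


if __name__ == "__main__":
    samples = [
        "data:image/svg+xml;charset=utf8,%3Csvg",  # embedded image, excluded
        "data:text/html;base64,PGh0bWw+",          # non-image payload, matched
        "https://example.com/",                    # not a data: URI at all
    ]
    for s in samples:
        print(s, is_non_image_data_uri(s))
```

With a check like this, the SVG background-image example above would no longer match, while an HTML payload in a data: URI still would.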

...and now __URI_DATA isn't hitting *anything*.

I suspect that the only data: URLs in the masscheck corpora are for embedded images. This makes sense if non-image data: URLs are being used primarily for spearphishing, since targeted mail like that is unlikely to end up in the corpora.

Chip, could you send me some spamples of non-image data: messages offlist? The only ones I have anywhere are images.


