On Thu, 15 Sep 2016, John Hardin wrote:

On Wed, 15 Sep 2016, Chip M. wrote:

 Sadly, I have more FP data for you. :(

 Here's one specific example (just a single very long line from
 one corpse):
  background-image: url("data:image/svg+xml;charset=utf8,%3Csvg
  width='104px' height='82px' viewBox='0 0 104 82' version='1.1'
  xmlns='http://www.w3.org/2000/svg'

Ok, I excluded image data from URI_DATA. This should reduce FPs without hurting spam/phish detection (I hope).

...and now __URI_DATA isn't hitting *anything*.

I suspect that the only data: URLs in the masscheck corpora are for embedded images. This makes sense if they're being used primarily for spearphishing.
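
For anyone following along, the exclusion can be illustrated with a minimal Python sketch. This is not the actual __URI_DATA rule text; the pattern, the function name, and the sample URIs are assumptions made up for illustration. It only shows the general idea of matching data: URIs while skipping those whose media type is image/*.

```python
import re

# Hypothetical pattern, NOT the real __URI_DATA rule: match a data: URI
# unless its media type begins with "image/" (negative lookahead).
NON_IMAGE_DATA_URI = re.compile(r"^data:(?!image/)", re.IGNORECASE)


def is_non_image_data_uri(uri: str) -> bool:
    """Return True for data: URIs whose media type is not image/*."""
    return NON_IMAGE_DATA_URI.match(uri.strip()) is not None


if __name__ == "__main__":
    # Illustrative samples only; the first mirrors the FP reported above.
    samples = [
        "data:image/svg+xml;charset=utf8,%3Csvg width='104px'",
        "data:text/html;base64,PGh0bWw+",
        "http://www.w3.org/2000/svg",
    ]
    for uri in samples:
        print(is_non_image_data_uri(uri), uri)
```

With a pattern like this, a corpus that contains only image data: URIs would produce no hits at all, which would be consistent with what I'm seeing.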

Chip, could you send me some spamples of non-image data: messages offlist? The only ones I have anywhere are images.

Thanks!

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Politicians never accuse you of "greed" for wanting other people's
  money, only for wanting to keep your own money.    -- Joseph Sobran
-----------------------------------------------------------------------
 Tomorrow: the 229th anniversary of the signing of the U.S. Constitution
