> -----Original Message-----
> From: Steven W. Orr [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 02, 2007 11:01 AM
> To: spamassassin-users
> Subject: Fundamental question about spam image processing.
>
> On Friday I attended the annual Spam Conference at MIT. While there, I
> spoke with a person who was an employee of Sophos. They are very proud of
> the proprietary spam filtering they do. We talked about SA and FuzzyOCR
> and I learned that they do extremely accurate spam analysis on image
> attachments without OCR. I was very intrigued because FuzzyOCR AFAICT is
> hugely CPU intensive. I tried running it at home and it worked for me (to
> a point) but I can't imagine this being viable in an industrial setting.
>
> It turns out that the basis for their analysis is to look at the size of
> the image as well as the number of colors. 99.99% of all spam images have
> less than 16 colors. Once they found an image with 22 colors. This sounds
> like a dirt cheap way to get a huge boost in spam recognition. They may
> have other tricks they do, but I just wanted to report what I learned.
>
> Can we do this?
Dallas's imageinfo plugin does sort of this already.

And CRAP! I forgot about the MIT conference!! :(

Thanks,

Chris Santerre
SysAdmin and Spamfighter
www.rulesemporium.com
www.uribl.com
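
For anyone wondering what a size-plus-color-count check might look like without any OCR, here is a rough sketch in Python. It is not how Sophos or the imageinfo plugin does it, just an illustration of the idea from the quoted message. It reads only the first 13 bytes of a GIF and uses the global palette size as a cheap upper bound on the number of colors (the image may use fewer). The function names and both thresholds (`max_colors`, `min_pixels`) are made up for the example and would need tuning against real mail.

```python
import struct


def gif_header_info(data: bytes):
    """Return (width, height, palette_size) from a GIF header, or None.

    palette_size is the number of entries in the global color table,
    or 0 if the file declares no global color table.
    """
    if len(data) < 13 or data[:6] not in (b"GIF87a", b"GIF89a"):
        return None
    # Logical screen descriptor: width and height (little-endian uint16),
    # then a packed flags byte.
    width, height, packed = struct.unpack("<HHB", data[6:11])
    # Bit 7 flags a global color table; bits 0-2 hold n, table size 2**(n+1).
    palette_size = 2 ** ((packed & 0x07) + 1) if packed & 0x80 else 0
    return width, height, palette_size


def looks_like_image_spam(data: bytes, max_colors=16, min_pixels=100 * 100):
    """Hypothetical heuristic: a large-ish image with a tiny palette.

    Both thresholds are assumptions for illustration, not measured values.
    """
    info = gif_header_info(data)
    if info is None:
        return False
    width, height, palette_size = info
    return 0 < palette_size <= max_colors and width * height >= min_pixels
```

Since this never decodes pixel data, the cost per attachment is a few byte comparisons, which is the appeal over OCR. PNG and JPEG would need their own header parsing, and an exact color count would require decoding the image.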
