> -----Original Message-----
> From: Steven W. Orr [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 02, 2007 11:01 AM
> To: spamassassin-users
> Subject: Fundamental question about spam image processing.
> 
> 
> On Friday I attended the annual Spam Conference at MIT. While there, I
> spoke with a person who was an employee of Sophos. They are very proud of
> the proprietary spam filtering they do. We talked about SA and FuzzyOCR
> and I learned that they do extremely accurate spam analysis on image
> attachments without OCR. I was very intrigued because FuzzyOCR AFAICT is
> hugely CPU intensive. I tried running it at home and it worked for me (to
> a point) but I can't imagine this being viable in an industrial setting.
> 
> It turns out that the basis for their analysis is to look at the size of
> the image as well as the number of colors. 99.99% of all spam images have
> less than 16 colors. Once they found an image with 22 colors. This sounds
> like a dirt cheap way to get a huge boost in spam recognition. They may
> have other tricks they do, but I just wanted to report what I learned.
> 
> Can we do this?

Dallas's imageinfo plugin already does something along these lines.
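
For anyone who wants to play with the color-count idea, here is a rough sketch in Python. It is not how Sophos or the imageinfo plugin actually work; it only illustrates the "fewer than 16 colors" heuristic quoted above. The function names and the representation of an image as a list of (R, G, B) tuples are assumptions made for the example, and the image-size check is left out because no size threshold was given.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B); an assumed pixel representation


def count_distinct_colors(pixels: Iterable[Pixel]) -> int:
    """Return the number of distinct colors among the given pixels."""
    return len(set(pixels))


def looks_like_image_spam(pixels: Iterable[Pixel], max_colors: int = 16) -> bool:
    """Flag an image whose palette has fewer than `max_colors` colors.

    The default of 16 is taken from the figure quoted in the thread;
    it is not a tuned value.
    """
    return count_distinct_colors(pixels) < max_colors


# Hypothetical sample data: a two-color "text on white" image
# and a 64-shade grayscale gradient.
text_image = [(255, 255, 255)] * 90 + [(0, 0, 0)] * 10
gradient = [(i, i, i) for i in range(64)]

print(looks_like_image_spam(text_image))
print(looks_like_image_spam(gradient))
```

With real attachments the pixel list would have to come from an image decoder, and a low color count on its own would probably be better used as one score among several than as a verdict.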

And CRAP! I forgot about the MIT conference!! :( 

Thanks,

Chris Santerre
SysAdmin and Spamfighter
www.rulesemporium.com
www.uribl.com

