Nels Lindquist wrote:

> As far as spammers obfuscating their images, couldn't that be worked
> around by tying OCR into the bayesian system?

I think the original idea was to obfuscate the images so that people
could read the text, but OCR tools couldn't.

> Then obfuscation wouldn't matter--whatever munging is done to a
> particular image would produce the same OCR strings, before and
> after bayes training. You wouldn't need to know particular strings
> to match beforehand in that case.

True, but you'd need to see enough of them to train your Bayes engine.

> That would force image spammers to produce a unique obfuscated
> graphic for every single message, which seems like an expensive
> proposition.

Sadly, serious spammers have virtually unlimited computing resources.
There are armies of thousands of zombie machines out there waiting to
do their masters' bidding... Adding random noise that fools OCR tools
but leaves the images legible for humans probably isn't that
computationally expensive.

The only way to defeat image spam would be for Microsoft to modify
Outlook so that it doesn't display HTML or images, and for Thunderbird
et al. to follow suit. Anyone care to bet on the odds of that
happening? :-(

Regards,

David.
_______________________________________________
NOTE: If there is a disclaimer or other legal boilerplate in the above
message, it is NULL AND VOID. You may ignore it.

Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list [email protected]
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang
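
To make the "OCR into Bayes" idea from the thread concrete, here is a rough sketch of what it might look like. Everything in it is an assumption for illustration: the TinyBayes class is hypothetical (it is not MIMEDefang's or any real filter's API), the OCR step is not shown, and the training strings are made-up stand-ins for whatever text an OCR tool would return. It only shows that a consistently munged string such as "st0ck" works as a token once the engine has seen it a few times, which is also why enough training samples are needed.

```python
import math
from collections import Counter


class TinyBayes:
    """Hypothetical two-class naive Bayes over word tokens (illustration only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Tokens are whatever the OCR step produced, munged or not.
        return text.lower().split()

    def train(self, label, text):
        self.counts[label].update(self.tokenize(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_docs = self.docs["spam"] + self.docs["ham"]
        log_score = {}
        for label in ("spam", "ham"):
            total = sum(self.counts[label].values())
            # Add-one smoothing on the class prior and on each token.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            for tok in self.tokenize(text):
                score += math.log(
                    (self.counts[label][tok] + 1) / (total + len(vocab) + 1)
                )
            log_score[label] = score
        # Normalise the two log scores into a probability of spam.
        top = max(log_score.values())
        weight = {k: math.exp(v - top) for k, v in log_score.items()}
        return weight["spam"] / (weight["spam"] + weight["ham"])


clf = TinyBayes()
# Made-up strings standing in for OCR output from image attachments.
clf.train("spam", "buy ch3ap st0ck n0w")
clf.train("spam", "ch3ap st0ck alert buy")
clf.train("ham", "meeting agenda for monday")
clf.train("ham", "photo from the office party")

spam_p = clf.spam_probability("st0ck alert buy n0w")
ham_p = clf.spam_probability("agenda for monday meeting")
```

Note that this only helps if the OCR tool returns the same strings for the same munging. Noise that makes the OCR output differ from message to message would defeat it, as discussed above.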

