-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Theo Van Dinter wrote:
> On Mon, Oct 02, 2006 at 03:18:58PM +0100, Randal, Phil wrote:
>>> undetected). Wouldn't it be better to inject the detected
>>> text back to SA? There should be enough variants of spam
>>> worlds to let SA fuzzily catch the ones from images.
>> I think so. Some of the words would be perfectly legitimate in the text
>> of emails but rarely found in attached legitimate images.
>>
>> Quite apart from the fact that Spamassassin isn't designed for
>> "reinjection".
>
> FWIW, 3.2 adds in support to have rendering of non-text parts. So a plugin
> could, for instance, OCR text from an image, and then the normal body rules
> and such would be able to use that information.
>

This sounds great. Once I get back to developing FuzzyOcr, I might add
an option to pass the text back to SA. Combined with a newer, more precise
OCR engine like tesseract, this will probably work very well.

Unfortunately, a lot of image spam is currently being sent around that
FuzzyOcr won't catch at all, because it uses new obfuscation techniques
such as animated GIFs, and I don't have the time at the moment to adjust
the plugin for these...
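To illustrate the "fuzzily catch" idea from the quoted text, here is a
minimal sketch in Python. It is not FuzzyOcr's actual code and does not use
the SpamAssassin plugin API; the word list, the function name fuzzy_hits and
the 0.75 threshold are all assumptions made up for the example. It takes
text that an OCR engine has already produced and reports which listed words
it approximately contains.

```python
from difflib import SequenceMatcher

# Hypothetical word list for illustration only.
SPAM_WORDS = ["viagra", "cialis", "stock", "pharmacy"]


def fuzzy_hits(ocr_text, words=SPAM_WORDS, threshold=0.75):
    """Return the sorted list of words that some OCR token closely resembles.

    threshold is an assumed similarity cutoff between 0 and 1; it would
    need tuning against real OCR output.
    """
    # Split on whitespace, drop surrounding punctuation, ignore case.
    tokens = [t.strip(".,:;!?()[]\"'").lower() for t in ocr_text.split()]
    hits = set()
    for token in tokens:
        if not token:
            continue
        for word in words:
            # ratio() is 1.0 for identical strings and lower for
            # strings that share fewer characters in order.
            if SequenceMatcher(None, token, word).ratio() >= threshold:
                hits.add(word)
    return sorted(hits)


if __name__ == "__main__":
    # OCR output often contains misread characters such as "1" for "i".
    print(fuzzy_hits("Buy V1agra from our pharrnacy today"))
```

Comparing whole tokens keeps the sketch short; it would miss words that the
OCR engine splits or merges, which is one reason a real plugin needs more
care than this.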
Best regards
Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFIVIfJQIKXnJyDxURAlIlAKCCcaD5O43KmvAHUxcew85d7cE82wCgwbGG
NAd6j8vgv1pvV9zVBN+5oqE=
=LB3n
-----END PGP SIGNATURE-----