-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Theo Van Dinter wrote:
> On Mon, Oct 02, 2006 at 03:18:58PM +0100, Randal, Phil wrote:
>>> undetected). Wouldn't it be better to inject the detected
>>> text back to SA? There should be enough variants of spam
>>> words to let SA fuzzily catch the ones from images.
>> I think so.  Some of the words would be perfectly legitimate in the text
>> of emails but rarely found in attached legitimate images.
>>
>> Quite apart from the fact that Spamassassin isn't designed for
>> "reinjection".
>
> FWIW, 3.2 adds in support to have rendering of non-text parts.  So a plugin
> could, for instance, OCR text from an image, and then the normal body rules
> and such would be able to use that information.
>
This sounds great. Once I get back to developing FuzzyOcr, I might add
an option to pass the text back to SA. Combined with a newer, more
precise OCR engine like tesseract, this will probably work very well.
Unfortunately, a lot of the picture spam currently being sent around
won't be caught at all by FuzzyOcr, because it uses new obfuscation
techniques such as animated GIFs, and I don't have the time at the
moment to adjust the plugin to handle them.
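
To illustrate the idea of fuzzily matching OCR output against known spam
words, here is a rough Python sketch. It is not how FuzzyOcr is
implemented; the word list, the function name fuzzy_hits and the 0.8
threshold are all assumptions made up for the example, and it uses
difflib from the standard library as a stand-in for whatever matching a
real plugin would do.

```python
from difflib import SequenceMatcher

# Hypothetical word list, for illustration only.
SPAM_WORDS = ["viagra", "stock", "pharmacy"]


def fuzzy_hits(ocr_text, words=SPAM_WORDS, threshold=0.8):
    """Return the listed words that approximately match some OCR token.

    OCR output is noisy (e.g. 'i' read as '1'), so each token is compared
    by similarity ratio instead of exact equality.
    """
    tokens = ocr_text.lower().split()
    hits = set()
    for word in words:
        for token in tokens:
            # ratio() is 2 * matches / total length, in the range 0..1
            if SequenceMatcher(None, word, token).ratio() >= threshold:
                hits.add(word)
                break  # one matching token is enough for this word
    return sorted(hits)
```

Calling fuzzy_hits() on the text recovered from an image would give the
list of words to score on. A lower threshold catches more mangled words
but will also produce more false positives on legitimate images.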

Best regards

Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFIVIfJQIKXnJyDxURAlIlAKCCcaD5O43KmvAHUxcew85d7cE82wCgwbGG
NAd6j8vgv1pvV9zVBN+5oqE=
=LB3n
-----END PGP SIGNATURE-----
