Henrik Krohns wrote:
> I don't get it.. unless you have some big honeypot, maybe 5% of traffic
> contain small images to be OCRd. If your server can't handle that, I guess
> it's running out of juice anyway. :)

Well... yeah.  <g>  The basic problem is that all the other garbage
(with the occasional inevitable exception) is getting caught by Clam
(viruses and most phishes) or SpamAssassin (all but a few text-based spams).

I've found *enough* similarities in the raw binary image data to
usefully make signatures for a lot of what is otherwise getting through;
at the moment this is just a stopgap until these machines can be retired.
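
For anyone wanting to try the same kind of thing, here is a rough sketch of the idea in Python; it is an illustration only, not the procedure actually used for these signatures. It finds the longest run of bytes shared by two sample images and prints it as a hex-signature line. The function names, the sample bytes and the signature name are all made up for the example, and the Name:TargetType:Offset:HexSignature layout (with 0 for any file type and * for any offset) is my understanding of ClamAV's extended signature format, so check it against the signature documentation before loading anything like this.

```python
from difflib import SequenceMatcher


def longest_common_run(a: bytes, b: bytes) -> bytes:
    """Return the longest contiguous byte run that appears in both inputs."""
    # autojunk=False stops difflib from discarding frequently repeated bytes,
    # which matters for binary data with long runs of the same value.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


def ndb_line(name: str, run: bytes, target_type: int = 0, offset: str = "*") -> str:
    """Format a byte run as Name:TargetType:Offset:HexSignature (assumed layout)."""
    return f"{name}:{target_type}:{offset}:{run.hex()}"


# Hypothetical stand-ins for two spam images that share an embedded block.
sample_a = b"GIF89a\x01\x02" + b"SPAMBLOCK1234" + b"\x09"
sample_b = b"\xff\xd8" + b"SPAMBLOCK1234" + b"\x00\x00"

run = longest_common_run(sample_a, sample_b)
print(ndb_line("Img.Spam.Example", run))
```

A real signature would need a much longer shared run than this toy one, taken from more than two samples, to keep false positives down.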

However, in the long run, OCR to feed the text to SpamAssassin's other
rules is a better solution;  it's much more flexible.

-kgd
_______________________________________________
http://lurker.clamav.net/list/clamav-users.html