Henrik Krohns wrote:
> I don't get it.. unless you have some big honeypot, maybe 5% of traffic
> contain small images to be OCRd. If your server can't handle that, I guess
> it's running out of juice anyway. :)
Well... yeah. <g>

The basic problem is that all the other garbage (with the occasional inevitable exception) is getting caught by Clam (viruses and most phishes) or SpamAssassin (all but a few text-based spams). I've found *enough* similarities in the raw binary image data to usefully make signatures for a lot of what is otherwise getting through; at the moment this is just a stopgap until these machines can be retired.

In the long run, however, OCR to feed the text to SpamAssassin's other rules is a better solution; it's much more flexible.

-kgd
