On Mon, Aug 14, 2006 at 08:46:51PM +0200, decoder wrote:
> gocr features a nice parameter called -d. It is able to remove smaller
> particles before scanning, compare these results:

So my problem with the OCR idea is that it inevitably gets to the point where we'd need to programmatically solve the same graphics as used in CAPTCHAs, and then I don't think we're really focused on addressing the core issue any longer.

It's mostly the same way in non-graphic spam -- catching the text may or may not be difficult with all the obfuscation that goes on. However, catching the fact that there's obfuscation is itself a good indication of spam. Just a thought.

-- 
Randomly Generated Tagline:
Capital Punishment means never having to say "YOU AGAIN?"