Re: ocrtext vs FuzzyOCR?

2006-10-30 Thread James Lay
On Mon, 30 Oct 2006 07:19:44 -0800
Jeff Chan [EMAIL PROTECTED] wrote:

 Does anyone have any opinions on which of these is better:
 
   http://wiki.apache.org/spamassassin/CustomPlugins
 
 OCR scanner and image validator SA-plugin
 Checks for specific keywords in gif/jpg/png attachments, using
 gocr. This can be used to detect spam that puts all the real
 content in an attached image, accompanied by random text and
 HTML (no URLs, etc.). There are also various rules to validate
 attached images and detect forged content types or broken images.
 This plugin needs SpamAssassin 3.1.1 or later. Version 2.0 is
 able to defeat recent GIF animations that use GIF tricks to
 avoid OCR.
 Created by: Martin Blapp
 Contact: mb -at- imp -dot- ch
 License Type: BSD
 Status: active
 Available at: [WWW] http://antispam.imp.ch/patches/patch-ocrtext
 Note: Feedback and new sample images are welcome. Please test and
 send reports.
 
 
 Fuzzy OCR Plugin
 Derived from OcrPlugin (see above), but has many feature
 enhancements, including an approximate matching algorithm to
 compensate for recognition errors and obfuscation, support for
 broken GIF, JPEG and PNG files, dynamic scoring, automatic
 format detection independent of the content type, and many more.
 Created by: Christian Holler
 Contact: decoder_at_own-hero_dot_net
 License Type: Same as SpamAssassin
 Status: active
 Available at: FuzzyOcrPlugin
 Note: Feedback and new sample images are welcome. Please test and
 send reports. 
 
 Jeff C.
 -- 

I'd like to see something on this myself.  The segfault patch for Fuzzy
OCR failed, so I stopped right there as I wasn't sure what to do next.

James
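
Editor's note: the Fuzzy OCR description quoted above mentions an approximate matching algorithm that compensates for recognition errors and obfuscation. The thread does not say how the plugin implements it, so the following is only a minimal sketch of the general idea, assuming a plain Levenshtein distance over a sliding window. The names edit_distance, fuzzy_contains and max_ratio are hypothetical and are not taken from FuzzyOcr.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]


def fuzzy_contains(text: str, keyword: str, max_ratio: float = 0.25) -> bool:
    """True if some window of `text` is close enough to `keyword`.

    Assumption for this sketch: both strings are lowercased and stripped
    of non-alphanumeric characters, and the allowed distance is a fixed
    fraction (max_ratio) of the keyword length.
    """
    text = "".join(ch for ch in text.lower() if ch.isalnum())
    keyword = "".join(ch for ch in keyword.lower() if ch.isalnum())
    if not keyword or len(text) < len(keyword):
        return False
    limit = int(len(keyword) * max_ratio)
    n = len(keyword)
    return any(edit_distance(text[i:i + n], keyword) <= limit
               for i in range(len(text) - n + 1))
```

With these assumptions, OCR output such as "buy ch3ap st0ck t0day" would still match the keyword "stock", because one substituted character stays within the allowed distance. A real plugin would likely tune the threshold per keyword and weigh matches into a score.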


Re: ocrtext vs FuzzyOCR?

2006-10-30 Thread decoder
James Lay wrote:
 On Mon, 30 Oct 2006 07:19:44 -0800 Jeff Chan [EMAIL PROTECTED]
 wrote:

 Does anyone have any opinions on which of these is better:

 http://wiki.apache.org/spamassassin/CustomPlugins

 [...]

 Jeff C. --

 I'd like to see something on this myself.  The segfault patch for
 Fuzzy OCR failed, so I stopped right there as I wasn't sure what to
 do next.

That patch is for gocr, not for FuzzyOcr. You will need it with
every OCR plugin that uses gocr. It should apply to gocr version 0.40.

Best regards,

Chris

 James



Re: ocrtext vs FuzzyOCR?

2006-10-30 Thread James Lay
On Mon, 30 Oct 2006 17:15:51 +0100
decoder [EMAIL PROTECTED] wrote:

 James Lay wrote:

 
  Jeff C. --
 
  I'd like to see something on this myself.  The segfault patch for
  Fuzzy OCR failed, so I stopped right there as I wasn't sure what to
  do next.
 
 This is no patch for FuzzyOcr but for gocr. You will need it with
 every OCR plugin that uses gocr... It should work with version 0.40
 
 Best regards,
 
 Chris
 
  James
 

Interesting.  Here's what I get when patching gocr-0.41.  It applied
cleanly to 0.40, though.  I guess this is just an FYI, really:

patching file src/pgm2asc.c
Hunk #1 FAILED at 1200.
Hunk #2 succeeded at 1719 with fuzz 2 (offset 466 lines).
1 out of 2 hunks FAILED -- saving rejects to file src/pgm2asc.c.rej

James
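
Editor's note: the output above shows one hunk rejected outright and another applied only with fuzz and a large offset, which suggests the patch was written against the 0.40 sources. As a small illustration of how to read such output, here is a sketch that summarizes it per file. It assumes the message format shown in this thread; summarize_patch_output and the "failed"/"inexact" keys are hypothetical names, not part of any tool.

```python
import re

# Matches lines such as:
#   Hunk #1 FAILED at 1200.
#   Hunk #2 succeeded at 1719 with fuzz 2 (offset 466 lines).
HUNK_RE = re.compile(
    r"^Hunk #(?P<num>\d+) (?P<status>FAILED|succeeded) at (?P<line>\d+)"
    r"(?: with fuzz (?P<fuzz>\d+))?"
    r"(?: \(offset (?P<offset>-?\d+) lines?\))?"
)


def summarize_patch_output(output: str) -> dict:
    """Group hunk results by file: rejected hunks and inexact matches."""
    summary = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("patching file "):
            current = line[len("patching file "):]
            summary[current] = {"failed": [], "inexact": []}
            continue
        match = HUNK_RE.match(line)
        if match and current is not None:
            num = int(match.group("num"))
            if match.group("status") == "FAILED":
                summary[current]["failed"].append(num)
            elif match.group("fuzz") or match.group("offset"):
                # Applied, but not at the exact place the patch expected.
                summary[current]["inexact"].append(num)
    return summary


# The output posted in this message, used as sample input.
SAMPLE = """\
patching file src/pgm2asc.c
Hunk #1 FAILED at 1200.
Hunk #2 succeeded at 1719 with fuzz 2 (offset 466 lines).
1 out of 2 hunks FAILED -- saving rejects to file src/pgm2asc.c.rej
"""

print(summarize_patch_output(SAMPLE))
```

Any hunk listed under "failed" ends up in the .rej file and has to be applied by hand or taken from a patch made for the matching gocr version; hunks under "inexact" were applied but are worth reviewing.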