-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Matthias Keller wrote:
> decoder wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> decoder wrote:
>>
>>> Hello there,
>>>
>>> I have improved the original OcrPlugin (found at
>>> http://wiki.apache.org/spamassassin/OcrPlugin) so that it supports
>>> fuzzy matching. That way, mistakes made by the OCR engine or
>>> intentional obfuscations in the text no longer make recognition
>>> impossible. This is done with a relative distance calculation
>>> between the pattern (a word from a given word list) and a line in
>>> the recognized input. The plugin also uses dynamic scoring (more
>>> matched words means a higher score; this can be adjusted in the
>>> source).
>>>
>>> You can find a full description and an example in the wiki
>>> under:
>>>
>>> http://wiki.apache.org/spamassassin/FuzzyOcrPlugin
>>>
>>>
>>> Ideas for improvements or criticism are always welcome :)
>>>
>>>
>>> Best regards,
>>>
>>>
>>> Chris
>>>
>>
>> Hello again,
>>
>>
>> I just released a new version which contains all suggestions made
>> here on the mailing list. Changelog:
>>
>> * Added scoring for wrong content-type
>> * Added scoring for broken gif images
>> * Added configuration for helper applications
>> * Added autodisable_score feature to disable the OCR engine if the
>>   message already has enough points
>>
>>
>> You can now obtain the plugin as a tarball; the download URL is
>> at the end of the wiki page.
>> (http://wiki.apache.org/spamassassin/FuzzyOcrPlugin)
>>
>> All new options in the config file, especially score adjustments
>> for the new features, are explained there as well and in the
>> sample cf file.
>>
> Hi
> I get the following warnings when linting:
>
> [29661] warn: config: warning: description exists for non-existent rule FUZZY_OCR_CORRUPT_IMG
> [29661] warn: config: warning: description exists for non-existent rule FUZZY_OCR_WRONG_CTYPE
> [29661] warn: lint: 2 issues detected, please rerun with debug enabled for more information
>
Indeed, I didn't notice that. It runs fine though; I'll fix it anyway by
moving the descriptions into the plugin config so that they are parsed by
the plugin itself.

Thx

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE20FxJQIKXnJyDxURAuIdAJ9PccWoKPz7mL0MyMqoEN6UMTh5WQCff09N
FMEIgWO7UpMe8ziacyS/tuo=
=6czY
-----END PGP SIGNATURE-----
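
P.S. For readers who want a feel for the fuzzy matching and dynamic
scoring described in the quoted announcement, here is a minimal sketch in
Python. It is not the plugin's code. It assumes plain Levenshtein distance,
a sliding window the same length as the word, and made-up names and values
(edit_distance, best_relative_distance, fuzzy_score, threshold, base_score,
per_word) chosen only for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def best_relative_distance(word: str, line: str) -> float:
    """Smallest edit distance between word and any window of the line
    that has the same length as word, divided by len(word)."""
    word, line = word.lower(), line.lower()
    n = len(word)
    if n == 0:
        raise ValueError("word must not be empty")
    if len(line) <= n:
        return edit_distance(word, line) / n
    best = min(edit_distance(word, line[i:i + n])
               for i in range(len(line) - n + 1))
    return best / n


def fuzzy_score(lines, words, threshold=0.25, base_score=1.0, per_word=0.5):
    """Return (score, matched_words). All numeric defaults are assumptions.

    A word counts as matched if some line contains a window whose relative
    distance to it is at most threshold. The score grows with the number
    of matched words (the "dynamic scoring" idea).
    """
    matched = [w for w in words
               if any(best_relative_distance(w, line) <= threshold
                      for line in lines)]
    if not matched:
        return 0.0, []
    return base_score + per_word * (len(matched) - 1), matched


if __name__ == "__main__":
    ocr_lines = ["Buy V1AGRA n0w", "cheap m3ds online"]
    word_list = ["viagra", "meds", "mortgage"]
    print(fuzzy_score(ocr_lines, word_list))
```

With the sample input, "viagra" and "meds" each match with a single
substituted character, while "mortgage" does not match. A relative
threshold scales the number of tolerated errors with the word length, so
short words are not matched too eagerly.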
