Paul Hutchings wrote:
> I'm running Spamassassin on OpenSuse 10.2 and have just installed
> FuzzyOCR.
>
> It appears to be working in that it scans/detects words in the supplied
> test files.
>
> I noticed "spamassassin --lint" gives:
>
> [25313] warn: FuzzyOcr: Cannot find executable for pamthreshold

This one means you don't have a recent version of Netpbm; pamthreshold appeared around version 10.34 (I'm using 10.35.21). Some tests will not work, so either install a recent Netpbm or use a workaround (there are some posts about this, but I don't use or know of one).

> [25313] warn: FuzzyOcr: Cannot find executable for tesseract

Tesseract is optional. I just comment out line 100 of FuzzyOcr.cf:

#focr_bin_helper tesseract

> Which seems fair enough as I don't have them.
>
> Is it just a spurious warning though or do I need to be concerned?
>
> Also as a general question other than adding words to the wordlist as
> and when, are there any "Must Know" tips n tricks for FuzzyOCR?

I would recommend at least reading FuzzyOcr.cf so you can see what can be controlled and get an idea of how things work.

The default parameters, as you have seen, work fine. I would only check focr_enable_image_hashing (disabled by default; the recommended setting is 2) and focr_base_score (the default of 5 is too high in my opinion, and there's a known bug that counts the same word as several repetitions, so the count is not very reliable).

-- 
René Berber
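
For reference, the FuzzyOcr.cf settings mentioned in this reply could be put together roughly as follows. This is only a sketch: the directive names come from the message itself, but the focr_base_score value of 4 is a placeholder assumption (the message only says the default of 5 is too high), and line positions in your copy of the file may differ.

```
# Sketch of the FuzzyOcr.cf lines discussed in this thread.

# Tesseract is optional; leave its helper commented out if it is not installed.
#focr_bin_helper tesseract

# Image hashing is disabled by default; 2 is the setting recommended above.
focr_enable_image_hashing 2

# The default is 5. The value 4 is a placeholder assumption for illustration only.
focr_base_score 4
```

After editing, running "spamassassin --lint" again should show whether the tesseract warning is gone; the pamthreshold warning will remain until a Netpbm release that includes pamthreshold is installed.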