Richard,

> I am looking at Fuzzy ocr to detect more image spam and I had a couple
> of questions;

FuzzyOCR does not detect image spam per se; it detects spam text inside an
image. To classify image spam, you could consider Image Cerberus, which
classifies images based on their metadata (size, presence of text, etc.).

> 1)      Is this being used? Does it detect image spam, or should I be
> looking at something else?

Yes. No, maybe.

I am running it, but it does not do a very good job of extracting the text
from the images. It then uses its own list of keywords to detect spam.
To me that is the biggest problem: it should hand the text back to
SpamAssassin and let SA rules decide what to do with it.
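
To illustrate what keyword matching against noisy OCR output involves, here
is a rough sketch in Python. It is not FuzzyOCR's actual code: the function
name, the keyword list and the 0.8 cutoff are all made up for the example,
and difflib is only a stand-in for whatever approximate matching the plugin
really uses.

```python
import difflib


def count_fuzzy_hits(ocr_text, keywords, cutoff=0.8):
    """Count keywords that approximately match some word in the OCR text.

    Hypothetical helper for illustration only; cutoff is the minimum
    difflib similarity ratio (0..1) for a word to count as a match.
    """
    words = ocr_text.lower().split()
    hits = 0
    for keyword in keywords:
        # get_close_matches returns the words whose similarity to the
        # keyword is at least `cutoff`; one match is enough for a hit.
        if difflib.get_close_matches(keyword.lower(), words, n=1, cutoff=cutoff):
            hits += 1
    return hits


# OCR often mangles letters ("v1agra"), so exact matching would miss these.
sample = "cheap v1agra and rep1ica watches"
print(count_fuzzy_hits(sample, ["viagra", "replica", "mortgage"]))
```

A private list like this lives outside the normal SA rules, which is why
handing the extracted text back to SpamAssassin would be the cleaner design.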

> 2)      I'm getting some horny date spam coming through with just
> images and text inside an image at the bottom. My bayes seems to be
> scoring this with -1.90 Bayes_00. I keep sending this to my database
> as spam but I'm not sure how many I need to feed it and I don't get
> much. Are there any other means of feeding bayes with image spam (or
> any spam really) from a source on the internet? Or is that a bad idea
> since that's not my spam?

The ideal plugin would be able to look at a picture and decide that it's
a horny date :) I remember we once had a student who wanted to work on
classifying pictures by the amount of flesh, to decide whether it was a
naked picture or not. But I don't think he ever succeeded.

> 3)      If I use Fuzzy OCR on FreeBSD, how does it get updated?

I doubt FuzzyOCR ever gets updated, on FreeBSD or elsewhere.

> 4)      I installed it from the ports and I had to install tesseract
> or I got a dependency warning message. Now I still get a warning -
> warn: FuzzyOcr: Cannot find executable for gifinter - Is this normal?
> How should I omit this error since I can't find gifinter in the ports
> tree?

gifinter used to be part of /usr/ports/graphics/giflib,
but the giflib NEWS file mentions that:
Version 5.0.1
=============
Retirements
-----------
* gifinter is gone.  Use convert -interlace from the ImageMagick suite.

In my case, I still have an old gifinter executable lying around,
but I think you could configure FuzzyOcr.cf with an appropriate line of
the form:

focr_bin_gifinter /usr/local/bin/convert -interlace

followed by whatever parameters are needed.
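
If focr_bin_gifinter turns out to accept only a path and no arguments, a
small wrapper script could stand in for gifinter. Below is a minimal sketch
in Python, under several assumptions: ImageMagick's convert is installed as
/usr/local/bin/convert, "-interlace GIF" is the mode you want, and the
function names are hypothetical. How FuzzyOCR actually invokes gifinter
(file arguments, stdin/stdout) would need to be checked and the wrapper
adapted to match.

```python
import subprocess


def build_convert_cmd(src, dst, interlace="GIF",
                      convert_bin="/usr/local/bin/convert"):
    """Build the argv for ImageMagick's convert with an interlace mode.

    convert_bin and the default interlace mode are assumptions; adjust
    them to your installation and to what FuzzyOCR expects.
    """
    return [convert_bin, src, "-interlace", interlace, dst]


def run_convert(src, dst, interlace="GIF"):
    """Run convert and return its exit status (0 on success)."""
    return subprocess.call(build_convert_cmd(src, dst, interlace))


# Show the command that would be executed, without running it.
print(" ".join(build_convert_cmd("in.gif", "out.gif")))
```

Passing interlace="None" instead asks convert for a non-interlaced image,
should that be what the plugin needs.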

Best regards,

Olivier
