Re: Image spam - FuzzyOCR?

2016-09-02 Thread RW
On Fri, 02 Sep 2016 10:19:22 +0700 Olivier wrote: > > Not really, he just said it matches against a word list. My point is > > that out of the several SA OCR plugins that have been written, > > FuzzyOCR is the one that's specifically designed for doing fuzzy > > matching on a finite word list. If

Re: Image spam - FuzzyOCR?

2016-09-01 Thread Olivier
> Not really, he just said it matches against a word list. My point is > that out of the several SA OCR plugins that have been written, FuzzyOCR > is the one that's specifically designed for doing fuzzy matching on a > finite word list. If you just pass the OCR output to Bayes or add it to > the

Re: Image spam - FuzzyOCR?

2016-09-01 Thread RW
On Thu, 1 Sep 2016 15:16:37 +0200 Matus UHLAR - fantomas wrote: > >> On Thu, Sep 1, 2016 at 12:27 AM, Olivier > >> wrote: > >> > I am running it, it does not do a very good job at extracting the > >> > text from the images. Then it uses its own list of keywords to

RE: Image spam - FuzzyOCR?

2016-09-01 Thread Richard Mealing
>-Original Message- >From: Matus UHLAR - fantomas [mailto:uh...@fantomas.sk] >Sent: Thursday, September 1, 2016 14:30 >To: users@spamassassin.apache.org >Subject: Re: Image spam - FuzzyOCR? >>On Wed, 31 Aug 2016 12:55:15 + Richard Mealing wrote: >>>

Re: Image spam - FuzzyOCR?

2016-09-01 Thread Matus UHLAR - fantomas
On Wed, 31 Aug 2016 12:55:15 + Richard Mealing wrote: 2) I'm getting some horny date spam coming through with just images and text inside an image at the bottom. My bayes seems to be scoring this with -1.90 Bayes_00. I keep sending this to my database as spam but I'm not sure how many I

Re: Image spam - FuzzyOCR?

2016-09-01 Thread RW
On Wed, 31 Aug 2016 12:55:15 + Richard Mealing wrote: > 2) I'm getting some horny date spam coming through with just > images and text inside an image at the bottom. My bayes seems to be > scoring this with -1.90 Bayes_00. I keep sending this to my database > as spam but I'm not sure how

Re: Image spam - FuzzyOCR?

2016-09-01 Thread Matus UHLAR - fantomas
On Thu, Sep 1, 2016 at 12:27 AM, Olivier wrote: > I am running it, it does not do a very good job at extracting the > text from the images. Then it uses its own list of keywords to > detect spam: to me it's the biggest problem, it should push back > the text to

Re: Image spam - FuzzyOCR?

2016-09-01 Thread RW
On Thu, 1 Sep 2016 06:23:37 -0400 Mauricio Tavares wrote: > On Thu, Sep 1, 2016 at 12:27 AM, Olivier > wrote: > > I am running it, it does not do a very good job at extracting the > > text from the images. Then it uses its own list of keywords to > > detect spam:

Re: Image spam - FuzzyOCR?

2016-09-01 Thread li...@rhsoft.net
On 01.09.2016 at 12:23, Mauricio Tavares wrote: I do agree that the OCR program should be doing the OCR'ing and the text filtering should be left to a program that does that for a living. In the modern, systemd world this is of course an ancient and outdated design philosophy this is simply

Re: Image spam - FuzzyOCR?

2016-09-01 Thread Mauricio Tavares
On Thu, Sep 1, 2016 at 12:27 AM, Olivier wrote: > Richard, > >> I am looking at Fuzzy ocr to detect more image spam and I had a couple >> of questions; > > FuzzyOCR does not detect image spam per se, it detects spam text in an > image. To classify image spam, you

Re: Image spam - FuzzyOCR?

2016-08-31 Thread Olivier
Richard, > I am looking at Fuzzy ocr to detect more image spam and I had a couple > of questions; FuzzyOCR does not detect image spam per se; it detects spam text in an image. To classify image spam, you could consider Image Cerberus, which does a classification on image metadata (size, presence

Image spam - FuzzyOCR?

2016-08-31 Thread Richard Mealing
Hi everyone, I am looking at Fuzzy ocr to detect more image spam and I had a couple of questions; 1) Is this being used? Does it detect image spam, or should I be looking at something else? 2) I'm getting some horny date spam coming through with just images and text inside an

Image Spam: FuzzyOcr doesn't hit

2009-08-24 Thread Toni Mueller
Hi, I've installed FuzzyOcr and all OCR programs that I could find, which apparently resulted in tesseract being chosen, but when I run spamassassin -D on a message containing image spam, I can only see that the FuzzyOcr plugin is being called, that it creates its databases, but nothing else,

Re: Image Spam: FuzzyOcr doesn't hit

2009-08-24 Thread Romain Dolbeau
Toni Mueller support-spamassas...@oeko.net wrote: I've installed FuzzyOcr and all OCR programs that I could find, which apparently resulted in tesseract being chosen, Normally all are being run. The default setting is to stop when one has returned a 'spam' status (i.e. it runs everything for

Re: Image Spam: FuzzyOcr doesn't hit

2009-08-24 Thread Toni Mueller
Hi Romain, On Mon, 24.08.2009 at 13:25:33 +0200, Romain Dolbeau rom...@dolbeau.org wrote: Normally all are being run. The default setting is to stop when one has returned a 'spam' status (i.e. it runs everything for 'ham'). ok. Check the detailed log of FuzzyOcr (not of SA); it's likely

Re: Image Spam: FuzzyOcr doesn't hit

2009-08-24 Thread RW
On Mon, 24 Aug 2009 12:51:53 +0200 Toni Mueller support-spamassas...@oeko.net wrote: Hi, I've installed FuzzyOcr and all OCR programs that I could find, which apparently resulted in tesseract being chosen, but when I run spamassassin -D on a message containing image spam, I can only see