Re: Image spam - FuzzyOCR?

2016-09-02 Thread RW
On Fri, 02 Sep 2016 10:19:22 +0700 Olivier wrote: > > Not really, he just said it matches against a word list. My point is > > that out of the several SA OCR plugins that have been written, > > FuzzyOCR is the one that's specifically designed for doing fuzzy > > matching on a finite word list. If

Re: Image spam - FuzzyOCR?

2016-09-01 Thread Olivier
> Not really, he just said it matches against a word list. My point is > that out of the several SA OCR plugins that have been written, FuzzyOCR > is the one that's specifically designed for doing fuzzy matching on a > finite word list. If you just pass the OCR output to Bayes or add it to > the b
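
For readers unfamiliar with the idea, "fuzzy matching on a finite word list" can be sketched roughly as below. This is only an illustration using Python's standard difflib, not FuzzyOCR's actual code or algorithm; the word list, the function name fuzzy_hits, and the 0.8 cutoff are all hypothetical choices made for the example.

```python
import difflib

# Hypothetical word list for illustration only; FuzzyOCR ships its own.
SPAM_WORDS = ["viagra", "cialis", "pharmacy", "mortgage"]


def fuzzy_hits(ocr_text, words=SPAM_WORDS, cutoff=0.8):
    """Return the list words that approximately match some token of ocr_text."""
    hits = set()
    for token in ocr_text.lower().split():
        # get_close_matches keeps only list words whose similarity
        # ratio to the token is at least `cutoff` (assumed value).
        for match in difflib.get_close_matches(token, words, n=1, cutoff=cutoff):
            hits.add(match)
    return sorted(hits)


if __name__ == "__main__":
    # Typical OCR damage: "i" read as "1", "m" read as "rn".
    print(fuzzy_hits("Cheap v1agra from our pharrnacy"))
```

The point of the approach is that OCR output is noisy, so exact keyword rules miss words that a similarity threshold still catches. A lower cutoff catches more damaged words at the cost of more false matches.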

Re: Image spam - FuzzyOCR?

2016-09-01 Thread RW
On Thu, 1 Sep 2016 15:16:37 +0200 Matus UHLAR - fantomas wrote: > >> On Thu, Sep 1, 2016 at 12:27 AM, Olivier > >> wrote: > >> > I am running it, it does not do a very good job at extracting the > >> > text from the images. Then it uses its own list of keywords to > >> > detect spam: to me it'

RE: Image spam - FuzzyOCR?

2016-09-01 Thread Richard Mealing
>-----Original Message----- >From: Matus UHLAR - fantomas [mailto:uh...@fantomas.sk] >Sent: Thursday, September 1, 2016 14:30 >To: users@spamassassin.apache.org >Subject: Re: Image spam - FuzzyOCR? >>On Wed, 31 Aug 2016 12:55:15 + Richard Mealing wrote: >>>

Re: Image spam - FuzzyOCR?

2016-09-01 Thread Matus UHLAR - fantomas
On Wed, 31 Aug 2016 12:55:15 + Richard Mealing wrote: 2) I'm getting some horny date spam coming through with just images and text inside an image at the bottom. My bayes seems to be scoring this with -1.90 Bayes_00. I keep sending this to my database as spam but I'm not sure how many I

Re: Image spam - FuzzyOCR?

2016-09-01 Thread RW
On Wed, 31 Aug 2016 12:55:15 + Richard Mealing wrote: > 2) I'm getting some horny date spam coming through with just > images and text inside an image at the bottom. My bayes seems to be > scoring this with -1.90 Bayes_00. I keep sending this to my database > as spam but I'm not sure how

Re: Image spam - FuzzyOCR?

2016-09-01 Thread Matus UHLAR - fantomas
On Thu, Sep 1, 2016 at 12:27 AM, Olivier wrote: > I am running it, it does not do a very good job at extracting the > text from the images. Then it uses its own list of keywords to > detect spam: to me it's the biggest problem, it should push back > the text to SpamAssassin and let SA rules deci

Re: Image spam - FuzzyOCR?

2016-09-01 Thread RW
On Thu, 1 Sep 2016 06:23:37 -0400 Mauricio Tavares wrote: > On Thu, Sep 1, 2016 at 12:27 AM, Olivier > wrote: > > I am running it, it does not do a very good job at extracting the > > text from the images. Then it uses its own list of keywords to > > detect spam: to me it's the biggest problem,

Re: Image spam - FuzzyOCR?

2016-09-01 Thread li...@rhsoft.net
On 01.09.2016 at 12:23, Mauricio Tavares wrote: I do agree that the OCR program should be doing the OCR'ing and the text filtering should be left to a program that does that for a living. In the modern, systemd world this is of course an ancient and outdated design philosophy this is simply

Re: Image spam - FuzzyOCR?

2016-09-01 Thread Mauricio Tavares
On Thu, Sep 1, 2016 at 12:27 AM, Olivier wrote: > Richard, > >> I am looking at Fuzzy ocr to detect more image spam and I had a couple >> of questions; > > FuzzyOCR does not detect image spam per se, it detects spam text in an > image. To classify image spam, you could consider image Cerberus that

Re: Image spam - FuzzyOCR?

2016-08-31 Thread Olivier
Richard, > I am looking at Fuzzy ocr to detect more image spam and I had a couple > of questions; FuzzyOCR does not detect image spam per se, it detects spam text in an image. To classify image spam, you could consider image Cerberus that does a classification on images metadata (size, presence o

Image spam - FuzzyOCR?

2016-08-31 Thread Richard Mealing
Hi everyone, I am looking at Fuzzy ocr to detect more image spam and I had a couple of questions; 1) Is this being used? Does it detect image spam, or should I be looking at something else? 2) I'm getting some horny date spam coming through with just images and text inside an image

Re: Image Spam: FuzzyOcr doesn't hit

2009-08-24 Thread RW
On Mon, 24 Aug 2009 12:51:53 +0200 Toni Mueller wrote: > > Hi, > > I've installed FuzzyOcr and "all" OCR programs that I could find, > which apparently resulted in tesseract being chosen, but when I run > "spamassassin -D " on a message containing image spam, I can only see > that the FuzzyOcr

Re: Image Spam: FuzzyOcr doesn't hit

2009-08-24 Thread Toni Mueller
Hi Romain, On Mon, 24.08.2009 at 13:25:33 +0200, Romain Dolbeau wrote: > Normally all are being run. The default setting is to stop when one has > returned a 'spam' status (i.e. it runs everything for 'ham'). ok. > Check the detailed log of FuzzyOcr (not of SA); it's likely it's > missing some

Re: Image Spam: FuzzyOcr doesn't hit

2009-08-24 Thread Romain Dolbeau
Toni Mueller wrote: > I've installed FuzzyOcr and "all" OCR programs that I could find, which > apparently resulted in tesseract being chosen, Normally all are being run. The default setting is to stop when one has returned a 'spam' status (i.e. it runs everything for 'ham'). > but when I run "s

Image Spam: FuzzyOcr doesn't hit

2009-08-24 Thread Toni Mueller
Hi, I've installed FuzzyOcr and "all" OCR programs that I could find, which apparently resulted in tesseract being chosen, but when I run "spamassassin -D " on a message containing image spam, I can only see that the FuzzyOcr plugin is being called, that it creates its databases, but nothing els