Re: FuzzyOcr question

2008-01-14 Thread --[ UxBoD ]--
Is decoder (Chris) still developing FuzzyOCR?

Regards,

-- 
--[ UxBoD ]--
// PGP Key: curl -s http://www.splatnix.net/uxbod.asc | gpg --import
// Fingerprint: F57A 0CBD DD19 79E9 1FCC A612 CB36 D89D 2C5A 3A84
// Keyserver: www.keyserver.net Key-ID: 0x2C5A3A84
// Phone: +44 845 869 2749 SIP Phone: [EMAIL PROTECTED]

- Original Message -
From: NFN Smith [EMAIL PROTECTED]
To: users@spamassassin.apache.org
Sent: 14 January 2008 17:35:30 o'clock (GMT) Europe/London
Subject: FuzzyOcr question

A couple of months ago, I updated FuzzyOcr to the current package 
version supported in Debian Stable (2.3b-1).

Since then, I've noticed that when FuzzyOcr scores a hit, the 
SpamAssassinReport.txt attachment shows the hit and the number of points 
scored, but the Description contains only BODY:, with no listing of which 
words were actually matched, e.g.:

2.0 FUZZY_OCR  BODY:


I'm not finding anything in docs or FuzzyOcr.cf that seems to govern 
this one, and for debugging purposes, I'd really like to know what terms 
are getting hits or not.

What am I missing?

Smith


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.






Re: FuzzyOcr question

2008-01-14 Thread Loren Wilton

> Is decoder (Chris) still developing FuzzyOCR?


I haven't seen any changes recently, nor any discussion on the FuzzyOCR 
mailing list.  But then I haven't seen a lot of OCR spams going by since the 
stock spams cut down in volume a while back.


I'd say it's a good tool to keep around just to keep them from coming back!

   Loren




Re: FuzzyOcr question

2008-01-14 Thread NFN Smith

Loren Wilton wrote:

>> Is decoder (Chris) still developing FuzzyOCR?
>
> I haven't seen any changes recently, nor any discussion on the FuzzyOCR 
> mailing list.  But then I haven't seen a lot of OCR spams going by since the 
> stock spams cut down in volume a while back.
>
> I'd say it's a good tool to keep around just to keep them from coming back!


The volume of graphical spam that needs FuzzyOCR is pretty limited on my 
spamtraps, although a couple of weeks ago, I saw a couple of bursts of 
pump-and-dump.


However, there's still a slow (but steady) volume of pillz spammers, 
and occasional watches and OEM getting through.


On the Pillz, there's one that looks like a Yambo one, and I finally 
tweaked my terms list enough that I'm getting a couple of FuzzyOcr 
points on that, but not quite enough to force a rejection.  There's also 
one pillz that's pretty offensive (but fairly infrequent) -- it got a 
couple of points, but I'd really like to get enough hits on that one to 
force rejection, so that my users don't see it.


Thus, I'd like to get a verification of what terms are actually getting 
hits.


Smith



Re: FuzzyOcr question

2008-01-14 Thread René Berber

NFN Smith wrote:

[snip]
> On the Pillz, there's one that looks like a Yambo one, and I finally 
> tweaked my terms list enough that I'm getting a couple of FuzzyOcr 
> points on that, but not quite enough to force a rejection.  There's also 
> one pillz that's pretty offensive (but fairly infrequent) -- it got a 
> couple of points, but I'd really like to get enough hits on that one to 
> force rejection, so that my users don't see it.
>
> Thus, I'd like to get a verification of what terms are actually getting 
> hits.


To do that, save the spam and run it through spamassassin with the 
'-D FuzzyOcr -x -t' parameters to see which words match and which don't.
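
A minimal sketch of that debugging run in Python, for anyone who wants to 
script it.  It assumes spamassassin is on the PATH, that the saved spam is 
fed on standard input, and that the debug lines of interest arrive on stderr 
and mention "FuzzyOcr".  The function names and the saved-spam.eml path are 
made up for illustration and are not part of FuzzyOcr or SpamAssassin.

```python
import subprocess


def build_debug_command(area="FuzzyOcr"):
    """Return the spamassassin argument list with the flags quoted above."""
    # -D <area>  restrict debug output to one area
    # -x         do not create a user preferences file
    # -t         test mode: append the full report to the message
    return ["spamassassin", "-D", area, "-x", "-t"]


def filter_lines(text, needle="FuzzyOcr"):
    """Keep only the lines that mention the needle (an assumed marker)."""
    return [line for line in text.splitlines() if needle in line]


def fuzzyocr_debug_lines(message_path):
    """Run a saved message through spamassassin and return FuzzyOcr debug lines."""
    with open(message_path, "rb") as message:
        result = subprocess.run(
            build_debug_command(),
            stdin=message,
            capture_output=True,
            check=False,
        )
    # Assumption: debug output is written to stderr, the report to stdout.
    stderr_text = result.stderr.decode("utf-8", errors="replace")
    return filter_lines(stderr_text)


if __name__ == "__main__":
    # Hypothetical file name; substitute the spam you saved.
    for line in fuzzyocr_debug_lines("saved-spam.eml"):
        print(line)
```

If the filter turns out to be too narrow for your version, print the whole 
stderr text instead and read through it by hand.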


The FuzzyOcr.words file defines the words FuzzyOcr looks for; most 
of us have a customized file: add words to that file, tweak factors, 
etc.  It all depends on which country/language you are using it for.


BTW, the version recommended on FuzzyOcr's site is 3.5.1 with some 
patches from the SVN repository.  The version you have has some issues 
with recent SpamAssassin versions, such as the report being empty or 
not formatted.

--
René Berber



RE: FuzzyOCR question

2006-11-17 Thread Giampaolo Tomassoni
> I'm brainstorming here tonight and I'm curious about something.  When 
> you're using FuzzyOCR, is it called for every message that goes through SA, 
> or just ones with gif attachments?

FuzzyOcr is invoked on every image in a message whenever the message itself 
doesn't reach a score threshold by other means.

I.e., if a message is already detected as spam before FuzzyOcr would run, 
FuzzyOcr is not invoked.
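
That gating rule can be sketched in a few lines of Python.  This is only an 
illustration of the behaviour described above, not FuzzyOcr's actual code; 
the function name, its parameters and the example threshold of 10.0 are 
assumptions.

```python
def should_run_ocr(score_so_far, threshold, image_count):
    """Decide whether OCR is worth running on a message.

    score_so_far: points the message has collected from other rules
    threshold:    score at which the message already counts as spam
    image_count:  number of image attachments found in the message
    """
    if image_count == 0:
        # Nothing to scan.
        return False
    # Skip the expensive OCR step once other rules have reached the threshold.
    return score_so_far < threshold


# A message with one image and a low score gets scanned ...
print(should_run_ocr(2.0, 10.0, 1))
# ... while one that already reached the threshold does not.
print(should_run_ocr(12.5, 10.0, 1))
```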

---
Giampaolo Tomassoni - IT Consultant
Piazza VIII Aprile 1948, 4
I-53044 Chiusi (SI) - Italy
Ph: +39-0578-21100

MAI inviare una e-mail a:
NEVER send an e-mail to:
 [EMAIL PROTECTED]


 
 
 Steven Lake
 Owner/Technical Writer
 Raiden's Realm
 www.raiden.net
 A friendly web community