Hi all,

It looks like sa-learn is having issues that have reduced SpamAssassin's
hit rate (system-wide install) to less than half of what it was catching
a few days ago:

See: http://ry.ca/spam-month.png

Blue represents messages identified as HAM. Green represents SPAM (> 5.0
from SpamAssassin). Scale is messages/minute, and note that, like most
MRTG graphs, the timeline goes from right to left. Note the significant
decrease in the number of identified SPAM messages starting a couple of
days ago. Analysis of messages that have made it through indicates that
some aren't getting BAYES_XX scores at all, and the ones that do are
scoring really low. Using my inbox for an example, I was used to a
90-95% catch rate. Since Saturday, it's around 40%.

The problem began when I ran my weekly sa-learn on HAM (n=851) and
SPAM (n=4657) with "sa-learn --spam --dir spam/". This time, though, I
received a great many errors identical to the following (wrapped):

Argument "[EMAIL PROTECTED]@" isn't numeric in lt at
/usr/local/lib/perl5/site_perl/5.005/Mail/SpamAssassin/BayesStore.pm
line 1267.

This was present in 2.61, but I upgraded to 2.63 on Sunday, only
to find the same result. OS is FreeBSD 4.9-p4. I've never seen this
problem before.

So, I tried rebuilding (long lines):

ttyp0 [EMAIL PROTECTED]:/var/spool/SpamAssassin #> sa-learn --rebuild
Argument "[EMAIL PROTECTED]@" isn't numeric in lt at /usr/local/lib/perl5/site_perl/5.005/Mail/SpamAssassin/BayesStore.pm line 1267.
Argument "[EMAIL PROTECTED]@" isn't numeric in lt at /usr/local/lib/perl5/site_perl/5.005/Mail/SpamAssassin/BayesStore.pm line 1267.
Argument "[EMAIL PROTECTED]@" isn't numeric in lt at /usr/local/lib/perl5/site_perl/5.005/Mail/SpamAssassin/BayesStore.pm line 1267.
Argument "[EMAIL PROTECTED]@" isn't numeric in lt at /usr/local/lib/perl5/site_perl/5.005/Mail/SpamAssassin/BayesStore.pm line 1267.
.
. (many, many lines)
.
synced Bayes databases from journal in 1 seconds: 1971 unique entries (2605 total entries)

ttyp0 [EMAIL PROTECTED]:/var/spool/SpamAssassin #> dir
total 128224
-rw-------    1 nobody   19873792 Mar 30 22:10 bayes_seen
-rw-------    1 nobody   171704320 Mar 30 22:10 bayes_toks


Help? :-) I'm assuming it's probably related to some bad message in the
samples I fed sa-learn, but a recursive grep for "[EMAIL PROTECTED]@" in the
spam and ham folders I used didn't turn up anything. Where do I look
now?

I still have a backup of Saturday's spam and ham folders, so I have the
opportunity to do some forensics.
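
For the forensics, a rough sketch along these lines could tally how many
messages in a folder got a BAYES_XX rule at all, so the "no Bayes score"
cases can be counted rather than eyeballed. It is only a sketch under
assumptions: it assumes each message carries an X-Spam-Status header whose
tests= list names the rules that fired, and the function names bayes_tag
and tally are made up for illustration. Reading the message files from
disk is left out.

```python
import re
from email import message_from_string

# Matches rule names such as BAYES_00 or BAYES_99 in the tests= list.
BAYES_RE = re.compile(r"\bBAYES_\d\d\b")


def bayes_tag(raw_message):
    """Return the BAYES_NN rule named in X-Spam-Status, or None if absent.

    Assumption: the header lists the rules that fired, e.g. tests=BAYES_00,...
    """
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    match = BAYES_RE.search(status)
    return match.group(0) if match else None


def tally(raw_messages):
    """Count messages per BAYES_NN rule; the None key counts messages
    that have no Bayes rule in their X-Spam-Status header."""
    counts = {}
    for raw in raw_messages:
        tag = bayes_tag(raw)
        counts[tag] = counts.get(tag, 0) + 1
    return counts
```

Feeding it the raw text of each message from the backed-up folders and
comparing the tallies before and after Saturday should show whether the
missing BAYES_XX hits line up with the sa-learn run.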

Thanks,
- Ryan

-- 
  Ryan Thompson <[EMAIL PROTECTED]>

  SaskNow Technologies - http://www.sasknow.com
  901-1st Avenue North - Saskatoon, SK - S7K 1Y4

        Tel: 306-664-3600   Fax: 306-244-7037   Saskatoon
  Toll-Free: 877-727-5669     (877-SASKNOW)     North America
