On Thu, Jan 15, 2004 at 08:22:37AM +1300, Sidney Markowitz wrote:
> Duncan Findlay wrote:
> >I'm not really sure why these mails seem to not get hit
> 
> I started running 2.61 on my real mail feed just about a week ago, 
> initially training Bayes on the most recent 2000 spam and about 500 ham 
> that I had received. I also have been learning every piece of mail that 
> comes in as either spam or ham. Since then it has scored BAYES_99 on 
> every one of the "poisoned" spam with random words after the /HTML tag.
> 
> Could there be something about the way you train or use Bayes that is 
> causing it to miss on these "poisoned" spam? Why would we see such 
> different results?

Hmmm... I just checked my Bayes DB and noticed something very strange:

0.000          0          2          0  non-token data: bayes db version
0.000          0        172          0  non-token data: nspam
0.000          0        407          0  non-token data: nham
0.000          0      21880          0  non-token data: ntokens
0.000          0 1073413315          0  non-token data: oldest atime
0.000          0 1074099954          0  non-token data: newest atime
0.000          0 1073413403          0  non-token data: last journal sync atime
0.000          0 1073413403          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
0.500          0          0          0  Fog
0.500          0          0          0  anti
0.500          0          0          0  largest
0.500          0          0          0  ile
0.500          0          0          0  requiring
0.500          0          0          0  HX-Spam-Checker:sk:spamass
0.500          0          0          0  Via-gr-a
0.500          0          0          0  spending
0.500          0          0          0  T.V
0.500          0          0          0  System!
0.500          0          0          0  init.d
....

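So essentially, I have no usable Bayes DB: every token has spam and
ham counts of 0 and a probability of 0.500. I don't know why the
counts are 0... that certainly shouldn't happen, should it? Also, I
don't know what happened to my cron jobs -- with only 172 spam and 407
ham learned, I definitely should have more mail in my DB.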
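
A quick way to quantify this is sketched below. It is a minimal
sketch, not part of SpamAssassin: it assumes the listing above was
saved as text and that its columns are probability, spam count, ham
count, atime and token, with the "non-token data" rows carrying their
value in the third column. The function name summarize_dump is
made up for illustration.

```python
import re

# Assumed column layout (not confirmed by the post):
#   probability, spam count, ham count, atime, token text
LINE_RE = re.compile(r"^\s*(\d+\.\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.*\S)\s*$")
MAGIC_PREFIX = "non-token data: "


def summarize_dump(text):
    """Return (magic values, token rows seen, token rows with 0/0 counts)."""
    magic = {}
    tokens = 0
    zero_tokens = 0
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skips wrapped fragments and the "...." elision
        _prob, nspam, nham, _atime, token = m.groups()
        if token.startswith(MAGIC_PREFIX):
            # Assumption: the value of a non-token row is in the third column.
            magic[token[len(MAGIC_PREFIX):]] = int(nham)
            continue
        tokens += 1
        if int(nspam) == 0 and int(nham) == 0:
            zero_tokens += 1
    return magic, tokens, zero_tokens
```

If zero_tokens equals tokens, as it would for the rows shown above,
then no token in the DB carries any learned counts, even though nspam
and nham are non-zero.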

I wonder if my DB is corrupt for some reason. I worry about this
because I occasionally downgrade SpamAssassin for testing, etc.

Anyone seen this before?

-- 
Duncan Findlay
