Here is the output from sa-learn -D --force-expire.  Judging by the
error toward the end, Theo's guess appears to be correct.  The next
question is: what harm is there in leaving this until 3.0?  I no longer
have a set of spam to feed the Bayes system, and I'm not sure how
inaccurate SA will be if I start fresh.  Any suggestions?

Thanks for the help.  I am running 2.63.

debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/usr/local/sbin', keeping.
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/sbin', keeping.
debug: PATH included '/bin', keeping.
debug: PATH included '/usr/bin/X11', which doesn't exist, dropping.
debug: Final PATH set to: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
debug: using "/usr/share/spamassassin" for default rules dir
debug: using "/etc/spamassassin" for site rules dir
debug: using "/root/.spamassassin/user_prefs" for user prefs file
debug: bayes: 27381 tie-ing to DB file R/O /etc/spamassassin/bayes_toks
debug: bayes: 27381 tie-ing to DB file R/O /etc/spamassassin/bayes_seen
debug: bayes: found bayes db version 2
debug: Score set 2 chosen.
debug: Initialising learner
debug: Initialising learner
debug: Syncing Bayes journal and expiring old tokens...
debug: lock: 27381 created /etc/spamassassin/bayes.lock.gateway2.oc.edu.27381
debug: lock: 27381 trying to get lock on /etc/spamassassin/bayes with 0 retries
debug: lock: 27381 link to /etc/spamassassin/bayes.lock: link ok
debug: bayes: 27381 tie-ing to DB file R/W /etc/spamassassin/bayes_toks
debug: bayes: 27381 tie-ing to DB file R/W /etc/spamassassin/bayes_seen
debug: bayes: found bayes db version 2
..
debug: bayes: expiry check keep size, 75% of max: 225000
debug: bayes: token count: 2331423, final goal reduction size: 2106423
debug: bayes: First pass?  Current: 1085172306, Last: 1085163864, atime: 172800, count: 40459, newdelta: 3319, ratio: 52.0631503497368
debug: bayes: Can't use estimation method for expiry, something fishy, calculating optimal atime delta (first pass)
debug: bayes: atime     token reduction
debug: bayes: ========  ===============
debug: bayes: 43200     2330836
debug: bayes: 86400     2330836
debug: bayes: 172800    2330836
debug: bayes: 345600    2330836
debug: bayes: 691200    2330836
debug: bayes: 1382400   2330836
debug: bayes: 2764800   2330836
debug: bayes: 5529600   2330836
debug: bayes: 11059200  2330836
debug: bayes: 22118400  2330836
debug: bayes: couldn't find a good delta atime, need more token difference, skipping expire.
debug: Syncing complete.
debug: bayes: 27381 untie-ing
debug: bayes: 27381 untie-ing db_toks
debug: bayes: 27381 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
debug: unlock: 27381 unlink /etc/spamassassin/bayes.lock
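
The figures on the expiry lines above fit together with simple arithmetic. The sketch below only reproduces the numbers printed in the log. The variable names are mine, and the relationships between the values are inferred from the log itself, not taken from the SpamAssassin source, so treat it as an assumption about how 2.63 arrives at them.

```python
# Values copied from the debug output above.
keep_size = 225000          # "expiry check keep size, 75% of max"
token_count = 2331423       # "token count"
last_expire_atime = 172800  # "atime" on the "First pass?" line
last_expire_count = 40459   # "count" on the same line
table_reduction = 2330836   # reduction listed for every atime in the table

# Assumption: the configured max is the keep size divided by 0.75.
max_db_size = round(keep_size / 0.75)

# Tokens that must go to get back down to the keep size.
goal_reduction = token_count - keep_size

# Assumption: ratio compares this goal with the previous expiry's count,
# and newdelta scales the previous atime delta by that ratio.
ratio = goal_reduction / last_expire_count
newdelta = last_expire_atime / ratio

# Tokens that would survive an expire at any of the deltas in the table.
survivors = token_count - table_reduction

print("max db size:    ", max_db_size)
print("goal reduction: ", goal_reduction)   # 2106423, as in the log
print("ratio:          ", round(ratio, 4))
print("newdelta:       ", int(newdelta))    # 3319, as in the log
print("survivors:      ", survivors)
```

Every delta from 12 hours to 256 days would remove the same 2330836 tokens and leave only 587, so there is no delta that lands near the goal, which matches the "couldn't find a good delta atime" message.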

-----Original Message-----
From: Matt Kettler [mailto:[EMAIL PROTECTED] 
Sent: Friday, May 21, 2004 2:43 PM
To: Kristopher Austin; [EMAIL PROTECTED]
Subject: Re: Bayes DB possible problem

At 02:35 PM 5/21/2004, Kristopher Austin wrote:
>I was looking at my Bayes DB files and noticed that they seem very
>large.  Is this a problem?

In your case, yes.


>54K May 21 13:28 bayes_journal
>82M May 21 13:28 bayes_seen
>80M May 21 13:28 bayes_toks
>
>I went ahead and did a sa-learn --dump magic and this is the output:
>
>0.000          0          2          0  non-token data: bayes db version
>0.000          0      70627          0  non-token data: nspam
>0.000          0      29182          0  non-token data: nham
>0.000          0    2041152          0  non-token data: ntokens
>0.000          0  956386256          0  non-token data: oldest atime
>0.000          0 2093049063          0  non-token data: newest atime
>0.000          0 1085163866          0  non-token data: last journal

<snip>

>Does it seem unusual to have 2 million tokens in the database?


Yes, it also seems strange for the "newest atime" to be so high relative
to oldest and last journal times.
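
To make the gap concrete, here is a minimal sketch that treats the three values from the dump as Unix timestamps and converts them to UTC dates. The numbers are copied from the --dump magic output quoted above; the dictionary and helper names are just for illustration.

```python
from datetime import datetime, timezone

# Values copied from the "sa-learn --dump magic" output above.
magic = {
    "oldest atime": 956386256,
    "newest atime": 2093049063,
    "last journal": 1085163866,
}

def to_utc(ts):
    """Interpret ts as seconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)

for name, ts in magic.items():
    print(f"{name:13} {ts:>10}  {to_utc(ts):%Y-%m-%d %H:%M:%S} UTC")

# How far the newest atime sits beyond the last journal sync, in years.
seconds_per_year = 365.25 * 86400
years_ahead = (magic["newest atime"] - magic["last journal"]) / seconds_per_year
print(f"newest atime is about {years_ahead:.1f} years after the last journal sync")
```

Read this way, the last journal time falls on the day of this thread in May 2004, while the newest atime lands in 2036, roughly 32 years in the future.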

What version of SA are you on? I've had problems with strange atimes on
SA 2.5x, but I've been free of them ever since I upgraded to 2.63.

Try doing an expire with debug output on:

sa-learn -D --force-expire

Maybe the debug output can offer some clues.