On Sun, 21 Mar 2004 14:55:44 -0300, Gustavo Michels
<[EMAIL PROTECTED]> wrote:
>On Sunday 21 March 2004 14:14, Alan Baxter wrote:
>
>> SA is *not* waiting or hanging for five minutes.  It's busy analyzing
>> your bayes database to find an appropriate expiry time.  Your bayes
>> database is about six times "bayes_expiry_max_db_size" and doesn't look
>> like it has been used much in the last 64 days.  As a result, SA can't
>> find an appropriate expire time and attempts to automatically expire the
>> database again at a later time.
>>
>> I suggest you turn off auto expiry by adding "bayes_auto_expire 0" to
>> your user_prefs file and attempt a manual expiry every few days with
>> "sa-learn --force-expire".
>
>Ok, thanks for the clarification, I added the option to my user_prefs so it 
>won't be doing that every boot.
>
>I also tried the manual expire with similar results to the log I sent before. 
>Is there something wrong with my bayes database? Shouldn't that expire finish 
>with a better result than "debug: bayes: couldn't find a good delta atime, 
>need more token difference, skipping expire"?

No, it shouldn't.  There's a good discussion of bayes expiry in the man
page for sa-learn that I kept in mind when studying your log file.  The
debug information "couldn't find a good delta atime, need more token
difference" means that it didn't have enough information to determine
which tokens should be removed from the database.  Of the 887218 tokens
in there, only 21636 have been used in the past 64 days.  I suppose it
might be reasonable for it to remove all but the 21636 tokens that have
been used, but expire will not remove any tokens if that would leave the
database with fewer than 112500 tokens.  (Unless you reduce
bayes_expiry_max_db_size.  See below.)
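
To put numbers on that, here is a small Python sketch that uses only the
figures from your log.  The function name and the simple yes/no check
are made up for illustration; this is not SA's actual expiry code.

```python
# Figures taken from the log discussed above.
TOTAL_TOKENS = 887218      # tokens currently in the bayes database
RECENT_TOKENS = 21636      # tokens used in the past 64 days
MIN_TOKENS_KEPT = 112500   # expire will not shrink the database below this


def can_drop_unused(recent, min_kept):
    """Hypothetical check: if every unused token were removed, only the
    recently used ones would remain. Is that still enough tokens?"""
    return recent >= min_kept


unused = TOTAL_TOKENS - RECENT_TOKENS
print("unused tokens:", unused)
print("tokens left after dropping them:", RECENT_TOKENS)
print("allowed:", can_drop_unused(RECENT_TOKENS, MIN_TOKENS_KEPT))
```

Dropping every unused token would leave far fewer than 112500, which is
why the expire is skipped.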

It looks like you're using a database that you got from somewhere else
instead of one that's based on the spam and ham that you've seen since
you started using SA.  That's not bad per se.  Is bayes working
effectively for you?  It won't expire any tokens until you've accessed
at least 112500 of them, and I think it might be several weeks before
you reach that number as a single user.

Your bayes might be more effective if you eliminate all of those unused
tokens.  You ought to be able to force it to remove all of the tokens
you haven't used by putting "bayes_expiry_max_db_size 28000" in your
user_prefs.  This should cause it to purge all of the tokens that you
haven't used the next time an expiry is attempted.  Once you've done
that you can remove the bayes_expiry_max_db_size line and bayes will
grow using only the tokens that you learn from your email.  I have a
single-user installation too, so I don't need auto expiration.  I just
do a manual expire once every month or so.
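
For what it's worth, the 28000 figure can be checked with a little
arithmetic.  The sketch below assumes that expiry aims to keep about 75%
of bayes_expiry_max_db_size, which is the ratio between 112500 and a
150000-token maximum.  Both the ratio and the 150000 value are
assumptions here, and any extra minimum that your SA version enforces is
not modelled, so treat the sa-learn man page as the authority.

```python
def keep_target(max_db_size):
    """Assumed rule: expiry aims to keep about 75% of the maximum size."""
    return max_db_size * 3 // 4


RECENT_TOKENS = 21636   # tokens used in the past 64 days (from the log)

# 150000 is an assumed default; 28000 is the temporary value suggested above.
for max_size in (150000, 28000):
    target = keep_target(max_size)
    if RECENT_TOKENS >= target:
        verdict = "all unused tokens could be removed"
    else:
        verdict = "removing all unused tokens would leave too few"
    print(max_size, target, verdict)
```

Under that assumption, 28000 puts the target just below the number of
tokens you have actually used, so the unused ones can all go.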

Hope this helps,
Alan
P.S.  Let me know how well this works if you try it.

-- 
Please avoid sending me Word or PowerPoint attachments.
See http://www.fsf.org/philosophy/no-word-attachments.html
