On Sun, 21 Mar 2004 16:32:11 -0300, Gustavo Michels
<[EMAIL PROTECTED]> wrote:
>On Sunday 21 March 2004 15:49, Alan Baxter wrote:
>
>> It looks like you're using a database that you got from somewhere else
>> instead of one that's based on the spam and ham that you've seen since
>> you started using SA.  That's not bad per se.  Is bayes working
>> effectively for you?  It won't expire any tokens until you've accessed
>> at least 112500, and I think it might be several weeks before you reach
>> that number as a single user.
>
>Well, I use Gentoo and all I did was emerge the package. Unless there's a 
>sample database in Gentoo's package (which, btw, I don't think there is), I 
>am sure this database was created by me.
>
>In the first weeks of use, I ran sa-learn --spam every day, since about 40% of 
>my spam wasn't being flagged. Nowadays the accuracy is extremely 
>good: I rarely get false positives or unflagged spam. I must say I am 
>happy with SA so far. However, I started having this 5-minute hang problem 
>(since I use it as a pipe-through filter action in KMail, it would make my mail 
>client hang for 5 minutes every day), and this wasn't happening in the first 
>weeks of use.

That seems about right.  It takes about 90 seconds to expire my database
when it reaches 150,000 tokens.

It sure looked weird to me that you had a database with over 800,000
tokens.  It should have been trying to auto expire every 12 hours ever
since it reached 150,000.  On top of that it showed that only 21,000 or
so had been accessed within the past two months.  It doesn't look like
you've received much email during that time frame.

>> Your bayes might be more effective if you eliminate all of those unused
>> tokens.  You ought to be able to force it to remove all of the tokens
>> you haven't used by putting "bayes_expiry_max_db_size 28000" in your
>> user_prefs.  This should cause it to purge all of the tokens that you
>> haven't used the next time an expiry is attempted.  Once you've done
>> that you can remove the bayes_expiry_max_db_size line and bayes will
>> grow using only the tokens that you learn from your email.  I have a
>> single user installation too, so I don't need auto expiration.  I just
>> do a manual expire once every month or so.
>
>Well, first I tried feeding the database as Theo Van Dinter suggested. Then I 
>reran the force expire and indeed it worked:
> ...
>
>Then I tried your "bayes_expiry_max_db_size 28000" suggestion, but there seems 
>to be a lower limit for the db size:
>
>debug: bayes: expiry check keep size, 75% of max: 21000
>debug: bayes: expiry keep size too small, resetting to 100,000 tokens

Thanks for doing the experiment. :-)  Rereading the sa-learn man page, I
see that the minimum keep size is 100,000.
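
For what it's worth, the two debug lines above are consistent with a rule
like the one below.  This is only a sketch in Python, inferred from that
output and from the 112500 figure I quoted earlier (75% of 150,000); it is
not SpamAssassin's actual code, and the function and constant names are
made up for illustration.

```python
# Hypothetical names; values taken from the debug output in this thread.
MIN_KEEP_TOKENS = 100_000  # "expiry keep size too small, resetting to 100,000 tokens"


def expiry_keep_size(max_db_size: int) -> int:
    """Return the number of tokens an expiry run would aim to keep.

    Assumed rule: keep 75% of bayes_expiry_max_db_size, but never
    fewer than MIN_KEEP_TOKENS.
    """
    target = max_db_size * 3 // 4  # "expiry check keep size, 75% of max"
    return max(target, MIN_KEEP_TOKENS)


print(expiry_keep_size(28_000))   # 75% is 21000, so the floor applies: 100000
print(expiry_keep_size(150_000))  # 112500
```

So under that reading, any bayes_expiry_max_db_size below roughly 133,334
ends up with the same 100,000-token keep size.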

>So I guess I will leave it at 100,000, not use auto-expire and do it like you 
>do, once a month. My main problem was KMail being hung for 5 minutes, and 
>that's not going to happen anymore.

Cool.  Glad SA is working so well for you.
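
If you want to script the monthly manual expire, something like this could
be called from a cron job.  A minimal sketch, assuming sa-learn is on your
PATH and that you've put "bayes_auto_expire 0" in user_prefs to turn off
automatic expiry; the helper names are hypothetical.

```python
import subprocess


def force_expire_command():
    """Return the argv list for a manual Bayes expiry run."""
    return ["sa-learn", "--force-expire"]


def force_expire(runner=subprocess.run):
    """Run the expiry once.

    With the default runner, check=True makes subprocess raise
    CalledProcessError if sa-learn exits with a non-zero status.
    """
    return runner(force_expire_command(), check=True)
```

Calling force_expire() once a month does the same thing as typing the
command by hand; the runner argument is only there so the function can be
tried out without actually touching the database.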

Alan
-- 
Please avoid sending me Word or PowerPoint attachments.
See http://www.fsf.org/philosophy/no-word-attachments.html
