http://bugzilla.spamassassin.org/show_bug.cgi?id=3225





------- Additional Comments From [EMAIL PROTECTED]  2004-04-02 17:20 -------
Subject: Re:  RFE: Bayes optimizations

On Sun, Mar 28, 2004 at 05:37:14PM -0800, [EMAIL PROTECTED] wrote:
> 
> ------- Additional Comments From [EMAIL PROTECTED]  2004-03-28 17:37 -------
> Note that the tok_get_all in the patch queries all of the tokens extracted
> from a message at once without checking if the resulting SELECT is too large
> for MySQL. I did not test with Michael Parker's suggestion of querying 25 at
> a time.
> 

I'm currently finishing up a fairly large hunk of storage (SQL and
DBM) optimizations/changes.  One is Sidney's tok_get_all with
batching.  Another moves token_count and the newest/oldest token
age values into the bayes_vars table (to avoid some table scans).
There is also a cache to avoid having to go to the database for
multiple items, plus some dead-code removal and general cleanup.
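
For anyone following along, here is a rough sketch of what a batched
tok_get_all could look like.  It is not the actual patch: it is written
in Python with SQLite standing in for MySQL, and the table layout
(bayes_token with token, spam_count, ham_count), the function signature,
and the demo helper are all assumptions made for illustration.  The batch
size of 25 is the figure from the quoted comment.

```python
import sqlite3

BATCH_SIZE = 25  # figure taken from the quoted comment; tune as needed


def tok_get_all(conn, tokens, batch_size=BATCH_SIZE):
    """Return {token: (spam_count, ham_count)} for tokens present in the table.

    Tokens are looked up in groups of `batch_size`, so no single SELECT
    grows with the size of the message.
    """
    found = {}
    unique = list(dict.fromkeys(tokens))  # drop duplicates, keep order
    for start in range(0, len(unique), batch_size):
        batch = unique[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        # Hypothetical schema: bayes_token(token, spam_count, ham_count)
        sql = ("SELECT token, spam_count, ham_count FROM bayes_token "
               "WHERE token IN (%s)" % placeholders)
        for token, spam, ham in conn.execute(sql, batch):
            found[token] = (spam, ham)
    return found


def demo():
    """Build a throwaway in-memory table and look up a mix of hits and misses."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bayes_token ("
                 "token TEXT PRIMARY KEY, spam_count INTEGER, ham_count INTEGER)")
    conn.executemany("INSERT INTO bayes_token VALUES (?, ?, ?)",
                     [("tok%d" % i, i, 2 * i) for i in range(60)])
    wanted = ["tok%d" % i for i in range(0, 80, 2)]  # 40 tokens, 30 stored
    return tok_get_all(conn, wanted)
```

Tokens that are not in the table are simply absent from the returned
dict, so the caller can treat a missing key as "never seen".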

The changes span a fairly large range, and will require a schema change.  I
hope to have everything finished up this weekend, assuming I don't get
pulled away for something else.

One thing of note: I've got what I consider a fairly decent benchmark
script now that I've been using for my testing. Hopefully I can
package it up for others to use as well.  It makes comparing changes
and different storage backends very easy.

Michael





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
