On 09/04/2012 06:26 PM, [email protected] wrote:
> On 09/04, Axb wrote:
>> Unless it's kept up to date, I suggest we remove jm's corpus data.
>
> I share your concern.  I'd say the problem is less jm's corpus, and more
> that we use ham up to 6 *years* old (in comparison to 2 months for spam,
> related to bug 6557).
>
> But jm's corpus is over a quarter of our ham corpora.  On the day you
> linked to, if we remove it, we're down to 167,229 hams.  Still over
> threshold (150,000), but I'd like more before we change this.
>
> And I'd suggest the change to make is reducing the maximum age of ham.


For a start, I just removed all my 2008/2009 ham.

