On 09/04/2012 06:26 PM, [email protected] wrote:
On 09/04, Axb wrote:
Unless it's kept up to date, I suggest we remove jm's corpus data.

I share your concern.  I'd say the problem is less jm's corpus, and more
that we use ham up to 6 *years* old (in comparison to 2 months for spam,
related to bug 6557).

+1. IMO, ham should not be older than 24 months.

But jm's corpus is over a quarter of our ham corpora.  On the day you
linked to, if we remove it, we're down to 167,229 hams.  Still over
threshold (150,000), but I'd like more before we change this.

But if ancient ham prevents good spam rules from scoring higher...

And I'd suggest the change to make is reducing the maximum age of ham.

Agreed.
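
For illustration, here is a rough Python sketch of what a ham age cap plus a threshold check could look like. Only the 24-month cap and the 150,000 threshold come from this thread. The function names and the 30-day month approximation are assumptions made for the example, and this is not how the actual mass-check tooling is implemented.

```python
from datetime import datetime, timedelta

# Minimum ham count mentioned in this thread.
HAM_THRESHOLD = 150_000


def recent_enough(msg_date, now, max_age_months=24):
    """Return True if msg_date is within the age cap.

    Assumption: a month is approximated as 30 days.
    """
    return now - msg_date <= timedelta(days=30 * max_age_months)


def filter_ham(dates, now, max_age_months=24):
    """Keep only the message dates that fall within the age cap."""
    return [d for d in dates if recent_enough(d, now, max_age_months)]


def meets_threshold(count, threshold=HAM_THRESHOLD):
    """Check whether the remaining ham count is still at or above the threshold."""
    return count >= threshold


if __name__ == "__main__":
    now = datetime(2012, 9, 4)
    # Hypothetical message dates, purely for demonstration.
    sample = [datetime(2006, 9, 1), datetime(2011, 1, 1), datetime(2012, 8, 1)]
    kept = filter_ham(sample, now)
    print(len(kept), "of", len(sample), "sample hams kept")
    print("167,229 hams over threshold:", meets_threshold(167_229))
```

Running the same filter over the real corpus dates would show how many hams survive a given cap before the limit is changed.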

