On Saturday, January 8, 2011, 6:20:19 PM, Warren Jr. wrote:
> It appears that some of the bb* corpora are extremely old and no
> longer representative of modern mail.  Would anyone object if I went
> ahead and cleaned it up a bit?   Proposed changes below.  Yes, this
> would shrink the ham sample size, but my active masscheck recruiting
> should grow that, and I think we're better off with quality data from
> more recent ham than quantity of old ham.

+1

Old corpora may result in incorrect scores being applied to current
messages.

There should be a generalized expiration strategy for the
corpora.

Cheers,

Jeff C.
