On 16.03.2016 at 02:26, John Hardin wrote:
On Tue, 15 Mar 2016, Ted Mittelstaedt wrote:

 we have scripts that check every sample against the current bayes
 classification and ignore it if it already hits BAYES_99,

Is this even necessary?  I thought the learner automatically
rejected everything already tagged.

Already *learned*. There's nothing preventing you from learning messages
that scored BAYES_999 (or BAYES_00)

At the beginning that's correct. For the first half year I also fed BAYES_999 messages to the learner to get common tokens scored higher, but at some point it drifted towards classifying too much ham as BAYES_50.

That was a sign of overtraining, which is prevented now. Over the last 6 months BAYES_00 has stayed at the same level, BAYES_50 doesn't grow much, and most junk lands between BAYES_80 and BAYES_999 and is scored high.

Since the corpus / bayes-db is kept forever, it also doesn't grow as fast as it did in the first 6 months.
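
For illustration only, the kind of pre-filter described above (skip a sample before training if it already hits BAYES_99 / BAYES_999) could look roughly like the sketch below. This is not the actual script. It assumes the sample has just been rescanned, so that its X-Spam-Status header reflects the current bayes-db, and the names `bayes_rules`, `should_train` and `SKIP_FOR_SPAM` are made up for this example.

```python
import re
from email import message_from_string

# Assumption: spam samples that already hit one of these rules are
# considered "known well enough" and are not fed to the learner again.
SKIP_FOR_SPAM = {"BAYES_99", "BAYES_999"}


def bayes_rules(raw_message):
    """Return the set of BAYES_* rule names listed in X-Spam-Status."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    # The header is usually folded over several lines; collapse whitespace.
    status = " ".join(status.split())
    return set(re.findall(r"\bBAYES_\d+\b", status))


def should_train(raw_message, skip_rules=SKIP_FOR_SPAM):
    """True if the sample hits none of the rules in skip_rules."""
    return not (bayes_rules(raw_message) & set(skip_rules))
```

Only samples for which `should_train()` returns True would then be handed to the learner. Passing `skip_rules={"BAYES_00"}` would give the equivalent check for a ham corpus, though that part is my assumption and not something described above.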
