On 16.03.2016 at 02:26, John Hardin wrote:
> On Tue, 15 Mar 2016, Ted Mittelstaedt wrote:
>>> we have scripts checking any samples against current bayes
>>> classification and ignore them if they already have BAYES_99,
>> Is this even necessary? I thought the learner automatically rejected
>> everything already tagged.
> Already *learned*. There's nothing preventing you from learning messages
> that scored BAYES_999 (or BAYES_00)
At the beginning that's correct. For the first half year I also fed BAYES_999 messages to get common tokens scored higher, but at some point it started classifying too much ham as BAYES_50.
That was a sign of overtraining, which is prevented now. Over the last 6 months BAYES_00 has stayed at the same level, BAYES_50 hasn't grown much, and most junk lands between BAYES_80 and BAYES_999 and scores high.
At the same time, since the corpus / bayes-db is kept forever, it doesn't grow as fast as in the first 6 months.
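
A minimal sketch of the kind of pre-filter described above, not the actual scripts. It assumes each sample already carries a fresh X-Spam-Status header from a scan against the current bayes-db, in the usual "tests=RULE,RULE,..." layout. The names SKIP_RULES, rules_hit and should_train are made up for illustration.

```python
import re
from email import message_from_string

# Assumption: samples already hit by these rules are not trained again.
SKIP_RULES = {"BAYES_99", "BAYES_999"}


def rules_hit(raw_message):
    """Return the set of rule names listed after tests= in X-Spam-Status."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    # Collapse header folding (newline + tab) into single spaces.
    status = " ".join(status.split())
    # Capture everything after tests= up to the next " key=" field or the end.
    match = re.search(r"tests=(.*?)(?= \w+=|$)", status)
    if not match:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


def should_train(raw_message):
    """True if the sample is not yet classified at the top Bayes levels."""
    return not (rules_hit(raw_message) & SKIP_RULES)
```

Samples for which should_train() returns True could then be handed to sa-learn with --spam or --ham as usual; everything else is skipped to avoid the overtraining effect described above.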