Hello Nix,

Saturday, July 2, 2005, 2:46:52 AM, you wrote:
N> This is far more elaborate than needed, I think. Limiting the age of
N> your spam corpus (which I do anyway) and using mass-check normally will
N> do the trick, as mass-check runs through mails in temporal order. The
N> only `error' will be that ham of age [now - a couple of years] will
N> cohabit in the Bayes DB with spam of age [now - six months]. If this
N> caused a problem Bayes would be nearly useless anyway :)

Except, doing it this simple way (which is how I do normal, non-bayes
mass-checks), means that you'd load (autolearn) a year's worth of ham
into your Bayes database before giving it the first spam. Your Bayes
database will be out of balance until it has learned a significant
number of spam or

N> If expiry runs it ditches the ancient email first in any case.

until the first significant expiry gets rid of much of that older ham.

N> I think I'll do a few local perceptron runs with mass-checks with
N> different --limits after the rescoring mass-check is completed, and
N> see just what effect varying the limit on ham actually has. I'm
N> blithering in the absence of data right now.

Good idea. I'm interested to know what you find.

Bob Menschel
