Clay Davis wrote:
> If I only ever sa-learn SPAM and never HAM, will the process
> eventually assign all "tokens" a high probability? Likewise, if I
> have never sa-learn[ed] any HAM will the first message that is
> sa-learn[ed] as SPAM assign a 1.000 to all tokens processed?

If you only learn spam, Bayes will never activate. It requires a minimum number of ham messages and a minimum number of spam messages (200 of each by default, set by bayes_min_ham_num and bayes_min_spam_num) before it will start scoring.

Once you have reached those minimums, if you continue to train only spam, your probabilities will gradually skew towards spam and you will eventually start seeing false positives.

It is normal for Bayes to see many more spam than ham, so I wouldn't worry too much about keeping it balanced, but you should definitely feed it both spam and ham on a regular basis.

-- 
Bowie
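
To make the skew concrete, here is a rough Python sketch of a naive per-token estimate. It is not SpamAssassin's actual math, which applies smoothing and combines tokens differently. The function names and the example counts are made up for illustration; the two thresholds are assumed to match the default bayes_min_ham_num and bayes_min_spam_num settings.

```python
# Assumed to mirror SpamAssassin's default minimums; adjust to your config.
MIN_HAM = 200
MIN_SPAM = 200


def bayes_active(nham, nspam):
    """Scoring only starts once BOTH corpora reach their minimum size."""
    return nham >= MIN_HAM and nspam >= MIN_SPAM


def token_spam_prob(spam_hits, ham_hits, nspam, nham):
    """Naive estimate: the token's spam frequency as a share of its
    combined spam and ham frequency (a simplification, no smoothing)."""
    spam_freq = spam_hits / nspam if nspam else 0.0
    ham_freq = ham_hits / nham if nham else 0.0
    total = spam_freq + ham_freq
    # A token never seen in either corpus is treated as neutral.
    return spam_freq / total if total else 0.5


# Spam-only training: no amount of spam activates Bayes without ham.
print(bayes_active(nham=0, nspam=5000))

# A token seen equally often in both corpora stays neutral.
print(token_spam_prob(spam_hits=50, ham_hits=50, nspam=200, nham=200))

# A new word that also occurs in legitimate mail, but was only ever
# learned from spam because ham training stopped, looks like pure spam.
print(token_spam_prob(spam_hits=30, ham_hits=0, nspam=2000, nham=200))
```

The last call is the false-positive case: the token has no ham count to offset it, so the next legitimate message that contains it gets pushed towards spam.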