Clay Davis wrote:
> If I only ever sa-learn SPAM and never HAM, will the process
> eventually assign all "tokens" a high probability?  Likewise, if I
> have never sa-learn[ed] any HAM will the first message that is
> sa-learn[ed] as SPAM assign a 1.000 to all tokens processed?   

If you only learn spam, Bayes will never activate.  By default it needs
at least 200 ham messages and 200 spam messages (the bayes_min_ham_num
and bayes_min_spam_num settings) before it will start scoring.
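
To make the activation rule concrete, here is a rough sketch in Python.
It is not SpamAssassin's code; the function name is made up for
illustration, and the thresholds assume the stock defaults of 200 for
bayes_min_ham_num and bayes_min_spam_num.

```python
# Assumed defaults, mirroring SpamAssassin's bayes_min_ham_num and
# bayes_min_spam_num settings (200 each unless overridden locally).
MIN_HAM = 200
MIN_SPAM = 200


def bayes_is_active(nham, nspam, min_ham=MIN_HAM, min_spam=MIN_SPAM):
    """Hypothetical helper: True once both corpora reach their minimum."""
    return nham >= min_ham and nspam >= min_spam


print(bayes_is_active(nham=0, nspam=5000))    # False: no ham learned yet
print(bayes_is_active(nham=200, nspam=5000))  # True: both minimums met
```

The real counts for your own database show up as nspam and nham in the
output of "sa-learn --dump magic".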

Once you have that minimum number, if you continue to train only spam,
your probabilities will gradually get skewed towards spam and you will
eventually start seeing false positives.  It is normal for Bayes to see
many more spam than ham, so I wouldn't worry too much about keeping it
balanced, but you should definitely feed it both spam and ham on a
regular basis.
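
A toy illustration of the skew, in Python.  This is a deliberately
simplified per-token estimate, not the formula SpamAssassin actually
uses, and the counts are invented.  It assumes a word that appears in
both kinds of mail but only came into use after ham training stopped.

```python
def token_spam_probability(spam_hits, ham_hits, nspam, nham):
    """Simplified estimate: share of a token's evidence that is spam.

    Hits are divided by corpus size, so a larger spam corpus on its
    own does not push the result towards spam.
    """
    spam_rate = spam_hits / nspam if nspam else 0.0
    ham_rate = ham_hits / nham if nham else 0.0
    if spam_rate + ham_rate == 0:
        return 0.5  # token never seen: no evidence either way
    return spam_rate / (spam_rate + ham_rate)


# Only spam was trained after the word appeared, so the database has
# never seen it in ham even though legitimate mail uses it too.
spam_only = token_spam_probability(spam_hits=30, ham_hits=0,
                                   nspam=2200, nham=200)

# The same word when ham training was kept up as well.
both = token_spam_probability(spam_hits=30, ham_hits=40,
                              nspam=2200, nham=400)

print(f"spam-only training:    {spam_only:.2f}")  # 1.00
print(f"spam and ham training: {both:.2f}")       # 0.12
```

With spam-only training the word looks like a pure spam indicator, so
a legitimate message containing it is pushed towards a false positive.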

-- 
Bowie
