>> If so, then this should happen - if you train the message twice,
>> then all the tokens for the message will be incremented twice and
>> the total count should be incremented twice.
> 
> If so then how can one tell how many *distinct* messages are 
> actually trained? It may be a little confusing if people try 
> to use this information to follow the recommendation of 
> "number of ham and spam of equal order".

SpamBayes doesn't care whether the ham or spam you train on are distinct
or not.  It's the total number of messages, not distinct messages, that
counts.  If you train on 500 copies of the same two messages (one ham,
one spam), then the math will work fine (but of course, it'll only be
any good at recognising those two messages).
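
To make the bookkeeping concrete, here's a rough sketch of the counting
described above.  ToyCounts and its attributes are made up for
illustration (this is not the SpamBayes API, and the whitespace
tokenizer is an assumption); it only shows that training the same
message twice bumps both the per-token counts and the message total
twice, with no record kept of whether the messages were distinct.

```python
from collections import Counter


class ToyCounts:
    """Hypothetical training store; NOT the real SpamBayes classes."""

    def __init__(self):
        self.nham = 0                  # total ham messages trained
        self.nspam = 0                 # total spam messages trained
        self.ham_tokens = Counter()    # token -> number of ham messages containing it
        self.spam_tokens = Counter()   # token -> number of spam messages containing it

    def train(self, text, is_spam):
        # Assumed tokenizer: lowercase, split on whitespace, and count each
        # token at most once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.nspam += 1
            self.spam_tokens.update(tokens)
        else:
            self.nham += 1
            self.ham_tokens.update(tokens)


db = ToyCounts()
db.train("cheap pills now", is_spam=True)
db.train("cheap pills now", is_spam=True)   # same message trained again
db.train("lunch at noon", is_spam=False)

print(db.nspam, db.nham, db.spam_tokens["cheap"])
```

Nothing in those counts can tell you afterwards that the two spam
messages were identical, which is why the totals are the only numbers
available for the "equal order" comparison.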

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this. 
_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
