Mark Hammond wrote:
That is high relative to the conventional wisdom, but I'm questioning
the correctness of that wisdom.
    

Check out this thread, which should give you a reasonable idea:

http://mail.python.org/pipermail/spambayes-dev/2003-November/001578.html

  
Perhaps it's time to re-evaluate that statement?
    

Google also shows anecdotal reports of poor results after an imbalance as
low as 2:1, so I don't think it would be responsible to re-evaluate that
statement until clear evidence was presented to the contrary.
  
I don't get a lot of ham, and currently have 55 ham and 580 spam in my SpamBayes database. Despite this, it seems to be working admirably. It is, however, very sensitive to even a single spam mistakenly put into the ham database, which then completely upsets the filtering.
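
To put a number on that imbalance, here is a small Python sketch. It is only illustrative arithmetic on the counts above, not anything taken from SpamBayes itself, and the function name imbalance_ratio is made up for the example.

```python
# Illustrative arithmetic only; this is not SpamBayes code.
# imbalance_ratio is a hypothetical helper name.
def imbalance_ratio(nham: int, nspam: int) -> float:
    """Return how many times larger the bigger class is than the smaller."""
    if nham <= 0 or nspam <= 0:
        raise ValueError("both counts must be positive")
    return max(nham, nspam) / min(nham, nspam)


# Counts quoted in this message: 55 ham, 580 spam.
ratio = imbalance_ratio(55, 580)
print(f"spam:ham is roughly {ratio:.1f}:1")  # prints "spam:ham is roughly 10.5:1"
```

So my database is a good five times more lopsided than the 2:1 figure mentioned in the anecdotal reports above.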

So if the received wisdom is that I need to balance the ratios, what should I do? Send myself ham? Not use spam from my unsure folder for training? Or get more friends?

regards,
Mike
_______________________________________________
SpamBayes@python.org
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
