RE: [Spambayes] Re: Training oddity/confusion

Tony Meyer Thu, 13 Jan 2005 12:22:20 -0800

> >With 'classic' train to exhaustion, the database is kept exactly 
> >balanced, I believe.  How well is your system working for you?
> 
> Erm, not all that well. :|


:(  I'm trying to get things rearranged a little for 1.1 so that it's easier
to try out different training regimes (including train to exhaustion) with
the various apps, so hopefully that'll help.

> My incoming mail is very unbalanced - 17:1 spam:ham since I 
> started the training - which can't help, but so far I have 
> 18% unsure spam and 3% false negatives. No mistakes on ham 
> though; none scored higher than 0.5%. Given that, I suppose I 
> could simply mess with the thresholds.

I've read reports of people who have done that (in an extreme way, so that
the cutoffs are 5% and 10% or something like that).  It seems pretty risky
to me, though, since a message that contains nothing that has been seen
before will score 0.5, and it would still score 0.5 under that system,
which puts it well above a 10% spam cutoff...
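
To make the risk concrete, here's a rough sketch.  It is not SpamBayes
code: classify() is a made-up helper, the boundary handling is my own
choice, and 0.20/0.90 are what I believe the default ham and spam cutoffs
to be, so treat those numbers as an assumption.

```python
def classify(score, ham_cutoff, spam_cutoff):
    """Map a spam score in [0, 1] to 'ham', 'unsure' or 'spam'.

    Hypothetical helper for illustration only; the real classifier's
    handling of scores exactly on a cutoff may differ.
    """
    if score < ham_cutoff:
        return "ham"
    if score < spam_cutoff:
        return "unsure"
    return "spam"


# A message made up entirely of never-seen tokens scores 0.5.
UNKNOWN_SCORE = 0.5

# Assumed default cutoffs: the unknown message lands in unsure.
print(classify(UNKNOWN_SCORE, ham_cutoff=0.20, spam_cutoff=0.90))

# Extreme cutoffs of 5% and 10%: the same message is filed as spam.
print(classify(UNKNOWN_SCORE, ham_cutoff=0.05, spam_cutoff=0.10))
```

So with the lowered cutoffs, any ham that looks unlike everything trained
so far gets filed as spam rather than landing in unsure for review.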

=Tony.Meyer

-- 
Please always include the list ([email protected]) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.

_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
