On Sun, Mar 30, 2003 at 12:42:26PM -0500, Theo Van Dinter wrote:
> On Sun, Mar 30, 2003 at 11:52:47AM -0500, Chris Devers wrote:
> > debug: bayes corpus size: nspam = 1289, nham = 29058
>
> wow, that's way a lot of ham.

I was curious, so I just checked mine:

0.000          0        749          0  non-token data: nspam
0.000          0      11770          0  non-token data: nham

Now, almost 100% of that ham is autolearned, so I should have a similarly strong weighting toward ham as Chris, right? This makes sense: I'm subscribed to a lot of high-traffic mailing lists, so I really do get a lot of legit mail. On the other hand, I'm pretty good about storing and learning all my spam, so it seems the imbalance is a problem.

Problem is, it *definitely* affects accuracy. 2.5 has definitely been going downhill since I installed it, with respect to how much spam gets through.

I was originally going to propose that autolearn not actually autolearn when nham >> nspam, but then it'd be difficult to track changing trends in email. So I don't know what to do.

-- 
Ross Vandegrift
[EMAIL PROTECTED]

A Pope has a Water Cannon. It is a Water Cannon.
He fires Holy-Water from it. It is a Holy-Water Cannon.
He Blesses it. It is a Holy Holy-Water Cannon.
He Blesses the Hell out of it. It is a Wholly Holy Holy-Water Cannon.
He has it pierced. It is a Holey Wholly Holy Holy-Water Cannon.
He makes it official. It is a Canon Holey Wholly Holy Holy-Water Cannon.
Batman and Robin arrive. He shoots them.
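
To put numbers on the imbalance, here is a rough Python sketch of the "don't autolearn ham when nham >> nspam" idea, run against the two corpus sizes quoted in this thread. It is only an illustration: the function names and the `max_ratio` cutoff of 10 are made up for this example and are not SpamAssassin options, and SpamAssassin itself does not expose a ratio check like this as far as this thread shows.

```python
def ham_spam_ratio(nspam: int, nham: int) -> float:
    """Return nham / nspam, or infinity if no spam has been learned yet."""
    if nspam <= 0:
        return float("inf")
    return nham / nspam


def should_autolearn_ham(nspam: int, nham: int, max_ratio: float = 10.0) -> bool:
    """Hypothetical gate: stop autolearning ham once ham outweighs spam
    by more than max_ratio. The cutoff of 10 is an arbitrary assumption."""
    return ham_spam_ratio(nspam, nham) <= max_ratio


if __name__ == "__main__":
    # Corpus sizes as reported in the thread.
    corpora = [("Chris", 1289, 29058), ("Ross", 749, 11770)]
    for name, nspam, nham in corpora:
        ratio = ham_spam_ratio(nspam, nham)
        allowed = should_autolearn_ham(nspam, nham)
        print(f"{name}: ham:spam = {ratio:.1f}:1, autolearn ham? {allowed}")
```

With these numbers both corpora sit well above the assumed cutoff (roughly 22.5:1 and 15.7:1), so the gate would block ham autolearning in both cases. A hard cutoff like this has exactly the drawback mentioned above: once ham learning stops, the database no longer tracks changes in legitimate mail.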