One feature that may be useful would be to let ASSP automatically scale 
how much spam/nonspam it collects, based on a couple of factors. It is 
often too easy to let the Bayesian database get skewed to one side 
(usually heavy on the spam side) due to imbalanced collecting (such as 
1:1, with an 80% spam rate).

Perhaps ASSP could look at rebuildrun.txt, read the value of the 
weighted norm, and decide whether it needs to adjust the collecting in 
one direction or the other. It could then look at Non-Local Mail 
Blocked (or another spam ratio indicator) to see how far it needs to 
skew the collecting (1:2, 1:4, and so on). I have close to 90% spam, so 
I've been using 1:10 to get my corpus norm down from 3.5.

For cases like mine, where the corpus is already heavily skewed, it 
would need to push the ratio even further (1:15, 1:20), then level out 
once the norm nears 1.0.
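
To make the idea concrete, here is a rough sketch in Python (not ASSP code, 
and not how ASSP works today). It assumes that "1:N" means keeping about 
one spam message out of every N, that the norm has already been read out 
of rebuildrun.txt as a number, and that the spam rate is available as a 
fraction. The function names, the tolerance, and the max_boost cap are all 
made up for illustration.

```python
def baseline_n(spam_fraction: float) -> float:
    """N that would balance incoming mail: spam messages per non-spam message.

    Assumption: spam_fraction comes from something like the Non-Local Mail
    Blocked percentage, expressed as a value in [0, 1).
    """
    if not 0.0 <= spam_fraction < 1.0:
        raise ValueError("spam_fraction must be in [0, 1)")
    return max(1.0, spam_fraction / (1.0 - spam_fraction))


def collection_n(spam_fraction: float, norm: float,
                 tolerance: float = 0.1, max_boost: float = 2.0) -> int:
    """Return N for a hypothetical 1:N collecting ratio.

    While the corpus norm is outside 1.0 +/- tolerance, the baseline is
    scaled by the norm (capped by max_boost in either direction) so the
    corpus is pushed back toward balance. Once the norm is close to 1.0,
    the plain baseline is used, so the ratio levels out.
    """
    if norm <= 0:
        raise ValueError("norm must be positive")
    base = baseline_n(spam_fraction)
    if norm > 1.0 + tolerance:
        boost = min(norm, max_boost)          # spam-heavy corpus: keep less spam
    elif norm < 1.0 - tolerance:
        boost = max(norm, 1.0 / max_boost)    # ham-heavy corpus: keep more spam
    else:
        boost = 1.0                           # close enough to balanced
    return max(1, round(base * boost))


if __name__ == "__main__":
    print("1:%d" % collection_n(0.90, 3.5))   # skewed corpus, 90% spam
    print("1:%d" % collection_n(0.90, 1.0))   # balanced corpus, 90% spam
```

With 90% spam and a norm of 3.5 this gives 1:18, which is in the 1:15 to 
1:20 range mentioned above, and it drops back to 1:9 once the norm is near 
1.0. The cap is there so a badly skewed norm does not choke off spam 
collecting entirely.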

Any thoughts?

_______________________________________________
Assp-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/assp-user
