Jim Maul wrote:
Ronan wrote:
hi all.
For those of you running large-volume servers, you no doubt have an abundance of spam to feed into sa-learn, and I suppose that goes for servers of any volume.
But one question: how do you manage to match that number with ham (real messages)? How do you go about bumping up the numbers to even out the DB? Am I right in saying that basically any mail that's not spam is ham, or is ham only supposed to be mail that was a false positive, i.e. tagged but not really spam?
Attempting to get these numbers equal is an unnecessary and, as you've discovered, almost futile task.
While I would *not* recommend relying on autolearning exclusively, it is working incredibly well here, with the occasional manual sa-learn run here and there. sa-learn --dump magic shows the following for my system:
0.000          0       1105          0  non-token data: nspam
0.000          0      28077          0  non-token data: nham
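The ham:spam ratio can be read straight off those two counts. As a rough sketch only: the Python below pulls nspam and nham out of text shaped like the dump above and divides them. The function names parse_magic_counts and ham_to_spam_ratio are made up for illustration and are not part of SpamAssassin, and the regular expression assumes the column layout shown here, with the count in the third column.

```python
import re

# Sample text copied from the dump quoted above.
SAMPLE_DUMP = """\
0.000          0       1105          0  non-token data: nspam
0.000          0      28077          0  non-token data: nham
"""

# Assumed layout: <score> <n> <value> <atime> non-token data: <name>.
# We capture the value (third column) and the name (nspam or nham).
_MAGIC_RE = re.compile(r"(\d+)\s+\d+\s+non-token data:\s+(nspam|nham)")


def parse_magic_counts(text):
    """Return a dict such as {'nspam': 1105, 'nham': 28077} (hypothetical helper)."""
    return {name: int(value) for value, name in _MAGIC_RE.findall(text)}


def ham_to_spam_ratio(counts):
    """Ham messages learned per spam message learned (hypothetical helper)."""
    if counts.get("nspam", 0) == 0:
        raise ValueError("no spam has been learned yet")
    return counts["nham"] / counts["nspam"]


if __name__ == "__main__":
    counts = parse_magic_counts(SAMPLE_DUMP)
    print(counts)
    print("ham:spam is about %.1f:1" % ham_to_spam_ratio(counts))
```

For the numbers above this works out to roughly 25 ham messages learned for every spam message.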
Jim, isn't your ratio of ham:spam 25:1 and not 1:25?
Oops, yep, you're correct, I had the order switched. Regardless, my point still stands :)
-Jim