Tonjg wrote on Thu, 18 Mar 2010 05:17:21 -0700 (PDT):

> I hope this command gives the correct answer...
> # sa-learn --dump magic
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0        514          0  non-token data: nspam
> 0.000          0        402          0  non-token data: nham
> 0.000          0      54301          0  non-token data: ntokens
Yes, that's not very much. I don't have such "tiny" dbs, so I don't know how well this works with one. As a gut feeling I'd say you want at least 200,000 tokens -- fresh tokens, not tokens from last year's spam and ham. My systems usually have at least half a million tokens, and I've been running with 2 million with no problems.

Kai

-- 
Get your web at Conactive Internet Services: http://www.conactive.com
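For anyone wanting to check their own db against a threshold like that, here is a minimal sketch that pulls the ntokens value out of the `--dump magic` output with awk. The here-string below stands in for a real `sa-learn --dump magic` run (the numbers are the ones quoted above); the 200000 threshold is just the gut-feeling figure from this thread, not an official limit.

```shell
#!/bin/sh
# Sample line from `sa-learn --dump magic`; in practice you would use:
#   dump=$(sa-learn --dump magic)
dump='0.000          0      54301          0  non-token data: ntokens'

# Third whitespace-separated field on the ntokens line is the token count.
ntokens=$(printf '%s\n' "$dump" | awk '/ntokens/ {print $3}')

# Warn if the Bayes db looks too small to classify well.
if [ "$ntokens" -lt 200000 ]; then
    echo "only $ntokens tokens - Bayes db may still be too small"
fi
```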