On 02/02/2010 15:21, Kārlis Repsons wrote:

By the way, I'm interested in scores. For example, I've set up automatic
sorting that divides spam into three categories: gray, certain, heavy. Looking
at STATISTICS.txt, my first impression of the boundaries was {4, 6.6, 8}, with
4 being the first valid "spam score". However, I currently use {3, 4, 8},
which might be too drastic for later use; it is fine just now, while I am
training the filters and receive no letters that could easily pass as spam.
What would your three-value combination be (given that you only wish to
personally check the "gray" folder)?

Decide how much spam you're willing to manually check, then parse your logs to determine the largest threshold you could set which wouldn't exceed that volume.
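
As a rough sketch of that approach (not a definitive implementation): the Python below pulls scores out of spamd log lines and finds the largest upper boundary for the "gray" folder that keeps its volume within a budget. It assumes log lines of the form "spamd: identified spam (6.3/5.0) for ..." and "spamd: clean message (0.3/5.0) for ...", so adjust the pattern to whatever your logs actually contain. The names extract_scores, largest_upper_threshold, lower and budget are made up for illustration, and the budget is counted over whatever period the log covers.

```python
import re

# Assumption: scores appear in spamd log lines such as
#   "spamd: identified spam (6.3/5.0) for user:1000 in 1.2 seconds, 2345 bytes."
#   "spamd: clean message (0.3/5.0) for user:1000 in 0.9 seconds, 1234 bytes."
SCORE_RE = re.compile(
    r"spamd: (?:identified spam|clean message) \((-?\d+(?:\.\d+)?)/"
)


def extract_scores(lines):
    """Collect the message score from every log line that matches SCORE_RE."""
    scores = []
    for line in lines:
        match = SCORE_RE.search(line)
        if match:
            scores.append(float(match.group(1)))
    return scores


def largest_upper_threshold(scores, lower, budget):
    """Largest t such that at most `budget` scores fall in [lower, t).

    Returns None when every score at or above `lower` already fits in the
    budget, i.e. the logs give no reason to cap the gray folder at all.
    """
    candidates = sorted(s for s in scores if s >= lower)
    if len(candidates) <= budget:
        return None
    # Only the `budget` lowest candidates can lie strictly below this score.
    return candidates[budget]


if __name__ == "__main__":
    # Hypothetical path; point this at your own mail log.
    with open("/var/log/mail.log", errors="replace") as log:
        scores = extract_scores(log)
    print(largest_upper_threshold(scores, lower=4.0, budget=50))
```

Here lower would be your first boundary (say 4) and the returned value a candidate for the second one. Anything scoring at or above it goes to the folders you don't check by hand.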

--
Mike Cardwell    : UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/       #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser       : Spam Tool  - http://spamalyser.com/
