https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6155

--- Comment #152 from Mark Martinec <[email protected]> 2009-11-08 16:36:24 UTC ---
> > A new run, this time I left the URIBL whitelists and similar fixed
> > (at their relatively high manual scores) as they were in current
> > 50_scores.cf

Or to say it better: in my previous runs I commented out most scores in
the existing 50_scores.cf, apart from a couple of exceptions (thus making
them mutable regardless of the <gen:mutable> markup). This time I did not
comment out any scores and let the <gen:mutable> markup do its job.
So this is now closer to how the GA was intended to be run.
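
To illustrate the distinction, here is a rough Python sketch (not the actual
rescoring tooling) of how fixed and mutable scores could be told apart. The
function name classify_scores and the sample text are hypothetical, and the
marker layout (comment lines containing <gen:mutable> and </gen:mutable>
around a block of score lines) is an assumption about the file format.

```python
def classify_scores(text):
    """Split 'score' lines into fixed scores and GA-mutable rule names.

    Assumption: mutable blocks are delimited by lines containing
    <gen:mutable> and </gen:mutable>. Commented-out score lines are
    skipped, so those rules end up with no fixed score at all.
    """
    fixed = {}
    mutable = set()
    in_mutable = False
    for raw in text.splitlines():
        line = raw.strip()
        if "</gen:mutable>" in line:
            in_mutable = False
        elif "<gen:mutable>" in line:
            in_mutable = True
        elif line.startswith("score "):
            # Drop any trailing comment, then split into name and values.
            parts = line.split("#", 1)[0].split()
            name = parts[1]
            values = [float(v) for v in parts[2:]]
            if in_mutable:
                mutable.add(name)
            else:
                fixed[name] = values
    return fixed, mutable


# Hypothetical sample, not taken from the real 50_scores.cf.
sample = """score A 1.0
# <gen:mutable>
score B 0 1.5 0 1.5 # trailing comment
# </gen:mutable>
# score C 2.0
"""
fixed, mutable = classify_scores(sample)
```

With this sample, A stays fixed, B is left to the GA, and the commented-out
C has no fixed score.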

> After a little examination, they look good to me!  +1 to check in.

Thanks. I'm sure we can still do some manual tweaks and improvements,
but perhaps we can indeed freeze the rest at the automatically assigned
scores from this run.

> btw if you feel like cranking up the max gens, go for it.  fwiw,
> spamassassin2.zones has a very powerful CPU -- if it's taking too long
> on your own machine, try scping stuff up and running it there.

My office workstation is quite beefy too, and I hope we won't need to do
many further runs, so for now I'd just stick to what I'm familiar with.
Btw, my set3 run at 14000 iterations takes 5 hours, similar for set1, the
other two are much faster (less than 30 minutes each). I just let it run
overnight, so it wouldn't matter if it takes half that time. I did some
previous runs at 30000 iterations, and a diagram (like the one attached
earlier) does not show noticeable improvement beyond about 10000, and even
shows a slight worsening by the end, so the 14000 limit seems reasonable.
Also, GAs are said to be prone to overfitting, so it's probably prudent
not to go too far.
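
A minimal Python sketch of how such a cut-off could be read from a
convergence curve rather than by eye. This is not part of the GA tooling:
plateau_point, its parameters, and the cost values are made up for
illustration, and a lower cost is assumed to be better.

```python
def plateau_point(costs, window, min_gain):
    """Return the first index after which the next `window` checkpoints
    improve the cost by less than `min_gain`, or None if no such index.

    costs: cost recorded at successive checkpoints (lower is better).
    """
    for i in range(len(costs) - window):
        best_ahead = min(costs[i + 1 : i + 1 + window])
        if costs[i] - best_ahead < min_gain:
            return i
    return None


# Invented curve: fast early gains, then a plateau and a slight worsening.
costs = [10, 6, 4, 3.5, 3.4, 3.4, 3.45, 3.5]
stop = plateau_point(costs, window=2, min_gain=0.2)
```

Here stop comes out as 3, the checkpoint after which two more checkpoints
gain less than 0.2. Stopping near such a point is one simple guard against
overfitting, at the price of having to pick window and min_gain by hand.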

> RCVD_IN_XBL is still surprisingly low -- I bet there's some additive
> behaviour overlapping between XBL and PBL, though.
> RCVD_IN_SBL is _very_ low in set 3 too, bizarre!
> otherwise I can't see any issues....

| Please manually adjust the scores of RCVD_IN_PSBL up.  At the time of the
| rescore masscheck PSBL had not yet whitelisted hotmail, yahoo, gmail and a
| number of major ISP's.  As a result, for 5 weeks straight RCVD_IN_PSBL has
| been almost completely devoid of FP's in our weekly masschecks.  I am
| confident that PSBL performs safer than measured during the rescore masscheck

Ok, I suggest we collect some manual fixes like the ones suggested here
(with specific score suggestions), and wrap it up.
