Michael Parker writes:
> Justin Mason wrote:
> > Michael Parker writes:
> >> Apache Wiki wrote:
> >>>
> >>> == Using network tests ==
> >>>
> >>> - For mass-checks for scoresets 1 or 3, using network tests, you need to
> >>> provide the {{{--net}}} switch. Ensure Net::DNS, Mail::SPF::Query,
> >>> Razor, Pyzor and DCC are installed.
> >>> + For mass-checks for scoresets 1 or 3, using network tests, you need to
> >>> provide the {{{--net}}} switch. Ensure Net::DNS, Mail::SPF::Query, Razor
> >>> (InstallingRazor), Pyzor (InstallingPyzor) and DCC (["InstallingDCC"])
> >>> are installed.
> >>>
> >> Razor/DCC/Pyzor are all use rules. So there is no need to install those
> > Erp, of course I meant to say "all reuse rules".
> >
> >> for a mass-check, unless you have lots of msgs that were not checked
> >> initially. We should only be using the historical data for these rules
> >> anyway, so probably best to not install them even if you do have a lot
> >> of messages that were not checked initially.
> >
> > That would be tricky; I've turned all of those off to reduce CPU time, and
> > (of course) my spamtraps have never scanned the messages using them.
>
> Then that greatly skews the usefulness of those rules, or any ##reuse
> rules really. The whole point of reuse is to get an accurate accounting
> of how those rules behave in realtime, not how well they do 6 months later.

Maybe we should do a census of which sets of optional #reuse rules each mass-check submitter has enabled; we may then be able to (a) compute reliable hit-frequencies by discarding logs from people who aren't running them, and (b) estimate which sets can be optimised with the Perceptron, and which will have to be judged manually (because of insufficient data).

--j.
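
A rough sketch of what such a census could look like, in Python. Everything here is an assumption for illustration: the function names, the in-memory layout (submitter name mapped to a list of per-message sets of rule hits), and the heuristic that a submitter "has a rule enabled" if it hit at least once in their logs. The rule names are only sample values, and parsing real mass-check logs into this shape is left out.

```python
# Illustrative rule names; the real list of reuse rules would come from the rules files.
REUSE_RULES = {"RAZOR2_CHECK", "PYZOR_CHECK", "DCC_CHECK"}


def census(logs, reuse_rules):
    """Map each submitter to the reuse rules that hit at least once in their logs.

    Heuristic only: a rule that is enabled but never hit looks the same as a
    rule that is disabled.
    """
    return {
        submitter: {rule for hits in messages for rule in hits if rule in reuse_rules}
        for submitter, messages in logs.items()
    }


def hit_frequency(logs, rule, enabled):
    """Fraction of messages hitting `rule`, counting only submitters who ran it.

    Returns None when no submitter appears to have the rule enabled, i.e. the
    "insufficient data" case that would need manual judgement.
    """
    total = 0
    hits = 0
    for submitter, messages in logs.items():
        if rule not in enabled.get(submitter, set()):
            continue  # discard logs from people who aren't running the rule
        total += len(messages)
        hits += sum(1 for message_hits in messages if rule in message_hits)
    return hits / total if total else None


# Hypothetical data: submitter "a" runs Razor, submitter "b" runs no network rules.
logs = {
    "a": [{"RAZOR2_CHECK", "SOME_RULE"}, {"SOME_RULE"}],
    "b": [{"SOME_RULE"}, {"SOME_RULE"}],
}
enabled = census(logs, REUSE_RULES)
razor_freq = hit_frequency(logs, "RAZOR2_CHECK", enabled)
dcc_freq = hit_frequency(logs, "DCC_CHECK", enabled)
```

With this toy data, submitter "b" is dropped from the Razor calculation, so its two messages do not dilute the frequency; DCC comes back as None because nobody appears to be running it.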