From: Kevin Golding <k...@caomhin.org>

>On Thu, 01 Jun 2017 03:52:42 +0100, David Jones <djo...@ena.com>
>wrote:
>> I am working pretty hard to get the ruleqa processing going again on our
>> new server. We are so close to having enough contributors and ham/spam
>> to get some new rules generated: This is from the run minutes ago:
>>
>> HAM CONTRIBUTORS FOUND: 9 (required 10)
>> SPAM CONTRIBUTORS FOUND: 9 (required 10)
>>
>> We need to recruit some more masscheck'ers to get over the hump so I can
>> do some final testing of the rules updates and start the DNS updates
>> again for sa-update.

>I disabled my masscheck after a while because... well, it was pointless.
>It's passed the cron window for the day but if you need the data I can run
>it manually, else it'll kick in again tomorrow.

Why do you think it was pointless? This does a couple of things:

1. It provides needed feedback to rules before they can be published to
the Internet via sa-update.

2. It adjusts the 72_scores.cf based on recent spam/ham which benefits
everyone using spamassassin all over the world that runs sa-update
regularly.

>> P.S. After spending the past month learning how this works, I have some
>> ideas on how to make the nightly masschecks become hourly fairly easily
>> so we can test and promote rule changes faster.

>You may need to explain the requirements for that. Are you asking for
>hourly masscheck submissions?

Today the delay of up to 24 hours is pretty slow to provide feedback or
score updates. I don't think this will ever be intended to update quickly
enough to help with zero-hour spam or replace technologies that react
quickly like RBLs, DCC, Pyzor, etc.

I have an idea that will allow masscheckers to cron the automasscheck.sh
script hourly which would only run the full masscheck when they detect a
new tagged ruleset to work with. Basically it would do a quick rsync of
the latest tagged build dir like it does today but if there are no rsync
changes, it would simply exit.

Everyone would still keep sorting ham/spam as they do today so there would
be no real change in that.
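
To make the "rsync, then exit if nothing changed" idea concrete, here is a rough sketch. It is written as a small Python wrapper rather than as a change to automasscheck.sh itself, and it is only one possible way to detect changes: it relies on rsync's --itemize-changes option printing nothing when no item was transferred, deleted, or had its attributes altered. The source, destination, and masscheck command are placeholders supplied by the caller, not the project's real paths.

```python
import subprocess


def rsync_reported_changes(itemized_output: str) -> bool:
    """Return True if rsync --itemize-changes listed at least one item.

    Any non-blank line counts, including deletions and attribute-only
    changes, so this errs on the side of running the masscheck.
    """
    return any(line.strip() for line in itemized_output.splitlines())


def sync_tagged_build(source: str, dest: str) -> bool:
    """Rsync the tagged build dir and report whether anything changed.

    `source` and `dest` are hypothetical; pass whatever the local
    masscheck setup already syncs today.
    """
    result = subprocess.run(
        ["rsync", "-a", "--delete", "--itemize-changes", source, dest],
        capture_output=True,
        text=True,
        check=True,  # raise if rsync itself fails
    )
    return rsync_reported_changes(result.stdout)


def run_if_new(source: str, dest: str, masscheck_cmd: str) -> int:
    """Run the full masscheck only when the sync picked up changes."""
    if not sync_tagged_build(source, dest):
        return 0  # no new tagged ruleset this hour, so simply exit
    # `masscheck_cmd` is a placeholder, e.g. the path to automasscheck.sh
    return subprocess.run([masscheck_cmd]).returncode
```

An hourly cron entry would call run_if_new() with the local paths; on most hours the rsync finds nothing new and the job ends after a few seconds.
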
Hopefully everyone is sorting at least every other day or every third day.
I try to sort some every day since I also have this tied to local Bayes
training, which makes the work a little more worth the time and effort.

Dave