updateDNS.sh on sa-vm1.apache.org - DNS updates disabled
3.3.3.updates (TXT) -> "1814048"
File /usr/local/bin/updateDNS.disabled exists, not updating DNS.
Re: Eureka: truncation of 72_active.cf
> That is an interesting find. I am not surprised though. It's very
> possible that the rules promotion processing script needs to be looked at
> too based on your finding. The time I spent learning how the scripts run
> back in May and June showed several deficiencies that need to be addressed
> once we get the current 72_scores.cf issue resolved, hopefully soon.
>
> Today if you commit a ruleset change during most hours of the day, it will
> throw off the current masscheck processing, making it invalid.

I have not looked at this part of the process yet, but I would be interested in what the effects are when this happens and what breaks.

> The masscheck rsync area will then have to be updated the next day and then
> another day for the masscheckers to validate rules and build the ruleset.
> There are only a few hours in the day that commits can happen that won't
> cause this problem. There should be an easy way to lock in the tagged
> revisions correctly, linked to the masscheck rsync revision so they keep
> building rulesets every day.

I see the backend/nitemc/svn_checkout script just checks out trunk without any revision or tag:

  svn co http://svn.apache.org/repos/asf/spamassassin/trunk

It is called by:

  backend/nitemc/run_all

which I assume kicks off the nightly masscheck.

But then masses/rule-qa/corpus-nightly updates to the latest revision from nightly-versions.txt for each corpus? And probably using the same working copy?

But I don't know yet what's actually happening in that working copy; the whole corpus stuff is still in the realm of magic for me at the moment ;)

> From: Merijn van den Kroonenberg
> Sent: Thursday, November 2, 2017 4:29 AM
> To: David Jones; Kevin A. McGrail
> Subject: Re: Eureka: truncation of 72_active.cf
>
> I am a bit confused about the corpus revision.
>
> Take for example this:
>
> Revision: 1813664
> Author: spamassassin_role
> Date: Sunday, 29 October 2017 3:47:00
> Message:
> updated scores for revision 1813595 active rules added since last
> mass-check
>
> And in the sysadmin mail from last night (rescore example):
>
> svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk trunk-new-rules-set1
>
> So the commit mentions revision 1813595; I assume it's also mentioned in
> the corpus logs and it's actually checked out.
>
> BUT 1813595 is not a valid revision for the spamassassin project??
>
> It's actually a revision in another Apache project.
>
> Revision: 1813595
> Author: deepak
> Date: Saturday, 28 October 2017 10:33:50
> Message:
> Improved: Add rat exclude files to excludes those files that does not need
> license header
> (OFBIZ-9856)
>
> Updated rat-excludes.txt file
>
> Modified : /ofbiz/ofbiz-framework/trunk/rat-excludes.txt
>
> So is something wrong somewhere with detecting the correct revision?
>
> -----Original Message-----
> From: Kevin A. McGrail
> Sent: Wednesday, November 01, 2017 10:49 PM
> To: David Jones ; Merijn van den Kroonenberg
> Subject: Re: Eureka: truncation of 72_active.cf
>
> On 11/1/2017 5:31 PM, David Jones wrote:
>>
>> I found another bug in the DKIM_VALID_EF rule description that needed to
>> be wrapped in a version check that was causing the ruleset validation to
>> fail with return code 4. Just committed another fix that should take
>> care of this.
>>
>> With the timing of the rule promotions and the masscheck validation, this
>> could take another ~40 hours to work itself out. I really want to improve
>> the way things work to speed up this cycle time.
>>
> Agreed. I am very excited to finally have this done though.
Re: Eureka: truncation of 72_active.cf
Well crap. I found another odd dependency that throws off the masscheck processing when commits are done outside the few-hour window:

https://wiki.apache.org/spamassassin/InfraNotes2017#nitemc

automc@sa-vm1:~$ ~/svn/trunk/build/mkupdates/run_nightly
+ promote_active_rules
+ pwd
/usr/local/spamassassin/automc/svn/trunk
+ /usr/bin/perl build/mkupdates/listpromotable
HTTP get: http://ruleqa.spamassassin.org/1-days-ago?xml=1
no 'mcviewing', 'mcsubmitters' microformats on day 1 URL: http://ruleqa.spamassassin.org/1-days-ago?xml=1
+ exit 25

This is a critical cron job that sets up the new rule promotions daily. I will dig into this one deeper, but it seems that if the masscheck SA revision gets out of sync with new commits that may not even be ruleset-related, then days have to pass with no commits before the stars will align properly again. Geez. What a mess with this workflow!

I think we need to carefully document the current SVN workflow and redesign it to handle this better. The masscheck processing of rulesets only needs to be tied to the ruleset revision staged in the rsync area for the current 24-hour period. Maybe we need a new masscheck-specific tag, separate from the rule promotion tags we have today? Rule promotions can happen at any time, multiple times a day, as long as they pass a lint check. Masscheck validations are currently only done once a day.

My initial thought is that parts of the scripts are using the latest SVN revision and parts are using the latest tagged revision from rule promotions. When these get out of sync, we don't get enough masschecking of the proper revision to keep moving everything forward.

Dave

On 11/02/2017 07:46 AM, Kevin A. McGrail wrote:

On 11/2/2017 8:40 AM, Merijn van den Kroonenberg wrote:

The checkout works as you would expect. But it's just very confusing that a revision is used which is not inside the spamassassin project. It might also cause side effects in other parts of the process, as David mentioned. But the checkout part is not actually broken. I do have some considerable experience with Subversion; the problem is more what the intention of the code should be ;)

My recommendation is do not try and unravel the thought process behind the code. Stay focused on the goal, which is to produce rules and distribute them. Anything you do towards that goal is good. If we break some eggs to make some omelets, great. Ideally, I would like to publish more daily rulesets, focus on optimization, etc.

Regards, KAM
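The run_nightly trace above fails only once listpromotable has fetched the ruleqa page and found neither marker. A small pre-flight check could make that condition visible before the promotion step runs. This is only a sketch: the function name has_masscheck_markers is hypothetical, and it assumes the 'mcviewing' and 'mcsubmitters' markers appear as literal strings in the saved response, which the error message suggests but nothing in this thread confirms.

```shell
#!/bin/sh
# Return 0 when the saved ruleqa response in "$1" mentions both markers
# named in the listpromotable error message; return 1 otherwise.
# ASSUMPTION: the markers appear as literal strings in the response body.
has_masscheck_markers() {
    grep -q 'mcviewing' "$1" && grep -q 'mcsubmitters' "$1"
}

# Hypothetical usage, after saving the day-1 page to a file by any means:
#   if has_masscheck_markers day1.xml; then
#       echo "day 1 masscheck data present"
#   else
#       echo "day 1 masscheck data missing, promotion would exit 25"
#   fi
```

A check like this does not fix the revision mismatch; it would only report the missing data earlier and with a clearer message.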
Re: mailspike in 72_scores
On 11/02/2017 08:11 AM, Kevin A. McGrail wrote:

On 11/2/2017 6:44 AM, Merijn van den Kroonenberg wrote:

While debugging the score generation problem, I encountered another weird case in 72_scores.cf. It is about the mailspike rules, which are in 50_scores and so should not be in 72_scores.

There was already an existing issue about this:
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6400

I found the cause of this and attached a patch to Bugzilla. But since it's already a very old issue, I wonder if someone will ever look at it anymore ;)

Thanks! I did have a look at it but I don't want to interfere with David's work. David, could you make the commit for him on this bug?

I certainly will. I would like to wait a couple of days if that is OK, since committing now could throw off the masscheck validations that we need to get sa-updates started again.

Dave

Regards, KAM
Optimize score-ranges-from-freqs (needless run of parse-rules-for-masses)
Hi,

This is about /masses and intended as a possible future todo, so I'll just drop it in this list.

The makefile contains:

  tmp/rules_${SCORESET}.pl: tmp/.created ../build/parse-rules-for-masses
      perl ../build/parse-rules-for-masses -d $(RULES) -s $(SCORESET) \
          -o tmp/rules_${SCORESET}.pl

Which generates, for example, a rules_0.pl.

In the same makefile is:

  tmp/ranges.data: tmp/.created freqs score-ranges-from-freqs
      perl add-hitless-active-to-freqs
      perl score-ranges-from-freqs $(RULES) $(SCORESET) < freqs
      perl lock-scores 1
      mv tmp/ranges.data-new tmp/ranges.data

As you can see, it runs score-ranges-from-freqs.

In score-ranges-from-freqs is this code:

  my $tmpf = "tmp/rules$$.pl";
  system "../build/parse-rules-for-masses ".
      "-d \"$argcffile\" -s $scoreset -o $tmpf" and die;
  require $tmpf;
  unlink $tmpf;

Which is doing the same thing.

So I would say: change the makefile to add tmp/rules_${SCORESET}.pl as a dependency of tmp/ranges.data, and then use tmp/rules_${SCORESET}.pl directly from score-ranges-from-freqs.

And to be consistent, also add add-hitless-active-to-freqs and lock-scores as dependencies of tmp/ranges.data, as they are all needed.

Cheers,
Merijn
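The gain from the proposed dependency is that make would regenerate tmp/rules_${SCORESET}.pl only when it is missing or older than its inputs, instead of score-ranges-from-freqs rebuilding a private copy on every run. The timestamp rule make applies can be sketched in shell as below. The helper name needs_rebuild is made up for illustration, and the -nt test is a widely supported extension (bash, dash, ksh) rather than strict POSIX.

```shell
#!/bin/sh
# needs_rebuild TARGET DEP...
# Return 0 (rebuild) when TARGET is missing or any DEP is newer than it;
# return 1 (up to date) otherwise. This mirrors make's timestamp rule.
needs_rebuild() {
    target=$1
    shift
    [ -e "$target" ] || return 0
    for dep in "$@"; do
        if [ "$dep" -nt "$target" ]; then
            return 0
        fi
    done
    return 1
}

# Hypothetical usage from the masses directory, with SCORESET set:
#   if needs_rebuild "tmp/rules_${SCORESET}.pl" ../build/parse-rules-for-masses; then
#       echo "rules file must be regenerated"
#   fi
```

With the dependency declared in the makefile, this check is what make performs on its own, so the extra parse-rules-for-masses call inside score-ranges-from-freqs would no longer be needed.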
Re: mailspike in 72_scores
On 11/2/2017 6:44 AM, Merijn van den Kroonenberg wrote:

While debugging the score generation problem, I encountered another weird case in 72_scores.cf. It is about the mailspike rules, which are in 50_scores and so should not be in 72_scores.

There was already an existing issue about this:
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6400

I found the cause of this and attached a patch to Bugzilla. But since it's already a very old issue, I wonder if someone will ever look at it anymore ;)

Thanks! I did have a look at it but I don't want to interfere with David's work. David, could you make the commit for him on this bug?

Regards, KAM
Re: Eureka: truncation of 72_active.cf
The checkout works as you would expect. But it's just very confusing that a revision is used which is not inside the spamassassin project. It might also cause side effects in other parts of the process, as David mentioned. But the checkout part is not actually broken. I do have some considerable experience with Subversion; the problem is more what the intention of the code should be ;)

-----Original Message-----
From: Kevin A. McGrail
Sent: Thursday, November 02, 2017 1:27 PM
To: Merijn van den Kroonenberg ; David Jones ; sysadmins@spamassassin.apache.org
Subject: Re: Eureka: truncation of 72_active.cf

+sysadmins@s.a.o so we don't lose these conversations.

Consider asking gst...@apache.org for his help on SVN. I simply don't play with enough tags, branches, revisions, etc.

Either there is a bug and we are injecting the wrong version, or you are just looking at things incorrectly. However, I think you perhaps are just doing something wrong??

This command, for example, appears to pull an SA rule set. Isn't that the expected behavior?

svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk trunk-new-rules-set1 | more
A    trunk-new-rules-set1/rulesrc
A    trunk-new-rules-set1/rulesrc/10_force_active.cf
A    trunk-new-rules-set1/rulesrc/sandbox
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_MIME_no_text.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_misc_testing.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_lotsa_money.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_MIME_in_body.cf

What do you get?

Regards, KAM

On 11/2/2017 5:29 AM, Merijn van den Kroonenberg wrote:

I am a bit confused about the corpus revision.

Take for example this:

Revision: 1813664
Author: spamassassin_role
Date: Sunday, 29 October 2017 3:47:00
Message:
updated scores for revision 1813595 active rules added since last mass-check

And in the sysadmin mail from last night (rescore example):

svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk trunk-new-rules-set1

So the commit mentions revision 1813595; I assume it's also mentioned in the corpus logs and it's actually checked out.

BUT 1813595 is not a valid revision for the spamassassin project??

It's actually a revision in another Apache project.

Revision: 1813595
Author: deepak
Date: Saturday, 28 October 2017 10:33:50
Message:
Improved: Add rat exclude files to excludes those files that does not need license header
(OFBIZ-9856)

Updated rat-excludes.txt file

Modified : /ofbiz/ofbiz-framework/trunk/rat-excludes.txt

So is something wrong somewhere with detecting the correct revision?

-----Original Message-----
From: Kevin A. McGrail
Sent: Wednesday, November 01, 2017 10:49 PM
To: David Jones ; Merijn van den Kroonenberg
Subject: Re: Eureka: truncation of 72_active.cf

On 11/1/2017 5:31 PM, David Jones wrote:

I found another bug in the DKIM_VALID_EF rule description that needed to be wrapped in a version check that was causing the ruleset validation to fail with return code 4. Just committed another fix that should take care of this.

With the timing of the rule promotions and the masscheck validation, this could take another ~40 hours to work itself out. I really want to improve the way things work to speed up this cycle time.

Agreed. I am very excited to finally have this done though.
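On the revision confusion above: the ASF Subversion repository is shared by many projects, so revision numbers are repository-wide. Checking out spamassassin/trunk at -r 1813595 returns trunk as it stood at that revision, even though the commit itself only touched OFBiz. To see which revision last changed trunk itself, svn info reports a "Last Changed Rev" field. A minimal shell sketch follows; the helper name last_changed_rev is made up for illustration, and it assumes English-language svn output, so a non-English locale (like the Dutch one in the quoted log) would need LC_ALL=C.

```shell
#!/bin/sh
# Read `svn info` output on stdin and print the value of its
# "Last Changed Rev" field. Prints nothing if the field is absent.
# ASSUMPTION: field labels are in English (run svn with LC_ALL=C).
last_changed_rev() {
    sed -n 's/^Last Changed Rev: *\([0-9][0-9]*\)$/\1/p'
}

# Hypothetical usage (needs network access and the svn client):
#   LC_ALL=C svn info -r 1813595 \
#       http://svn.apache.org/repos/asf/spamassassin/trunk | last_changed_rev
```

Recording that last-changed revision next to the global one in commit messages or logs might make it easier to tell which SpamAssassin commit a masscheck run actually corresponds to.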
Re: Eureka: truncation of 72_active.cf
On 11/2/2017 8:40 AM, Merijn van den Kroonenberg wrote:

The checkout works as you would expect. But it's just very confusing that a revision is used which is not inside the spamassassin project. It might also cause side effects in other parts of the process, as David mentioned. But the checkout part is not actually broken. I do have some considerable experience with Subversion; the problem is more what the intention of the code should be ;)

My recommendation is do not try and unravel the thought process behind the code. Stay focused on the goal, which is to produce rules and distribute them. Anything you do towards that goal is good. If we break some eggs to make some omelets, great. Ideally, I would like to publish more daily rulesets, focus on optimization, etc.

Regards, KAM
Re: Eureka: truncation of 72_active.cf
+sysadmins@s.a.o so we don't lose these conversations.

Consider asking gst...@apache.org for his help on SVN. I simply don't play with enough tags, branches, revisions, etc.

Either there is a bug and we are injecting the wrong version, or you are just looking at things incorrectly. However, I think you perhaps are just doing something wrong??

This command, for example, appears to pull an SA rule set. Isn't that the expected behavior?

svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk trunk-new-rules-set1 | more
A    trunk-new-rules-set1/rulesrc
A    trunk-new-rules-set1/rulesrc/10_force_active.cf
A    trunk-new-rules-set1/rulesrc/sandbox
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_MIME_no_text.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_misc_testing.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_lotsa_money.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_MIME_in_body.cf

What do you get?

Regards, KAM

On 11/2/2017 5:29 AM, Merijn van den Kroonenberg wrote:

I am a bit confused about the corpus revision.

Take for example this:

Revision: 1813664
Author: spamassassin_role
Date: Sunday, 29 October 2017 3:47:00
Message:
updated scores for revision 1813595 active rules added since last mass-check

And in the sysadmin mail from last night (rescore example):

svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk trunk-new-rules-set1

So the commit mentions revision 1813595; I assume it's also mentioned in the corpus logs and it's actually checked out.

BUT 1813595 is not a valid revision for the spamassassin project??

It's actually a revision in another Apache project.

Revision: 1813595
Author: deepak
Date: Saturday, 28 October 2017 10:33:50
Message:
Improved: Add rat exclude files to excludes those files that does not need license header
(OFBIZ-9856)

Updated rat-excludes.txt file

Modified : /ofbiz/ofbiz-framework/trunk/rat-excludes.txt

So is something wrong somewhere with detecting the correct revision?

-----Original Message-----
From: Kevin A. McGrail
Sent: Wednesday, November 01, 2017 10:49 PM
To: David Jones ; Merijn van den Kroonenberg
Subject: Re: Eureka: truncation of 72_active.cf

On 11/1/2017 5:31 PM, David Jones wrote:

I found another bug in the DKIM_VALID_EF rule description that needed to be wrapped in a version check that was causing the ruleset validation to fail with return code 4. Just committed another fix that should take care of this.

With the timing of the rule promotions and the masscheck validation, this could take another ~40 hours to work itself out. I really want to improve the way things work to speed up this cycle time.

Agreed. I am very excited to finally have this done though.