updateDNS.sh on sa-vm1.apache.org - DNS updates disabled

2017-11-02 Thread noreply

3.3.3.updates (TXT) -> "1814048"

File /usr/local/bin/updateDNS.disabled exists, not updating DNS.


Re: Eureka: truncation of 72_active.cf

2017-11-02 Thread Merijn van den Kroonenberg
> That is an interesting find.  I am not surprised though.  It's very
> possible that the rules promotion processing script needs to be looked at
> too based on your finding.  The time I spent learning how the scripts run
> back in May and June showed several deficiencies that need to be addressed
> once we get the current 72_scores.cf issue resolved, hopefully soon.
>
>
> Today if you commit a ruleset change during most hours of the day, it will
> throw off the current masscheck processing, making it invalid.  The

I have not looked at this part of the process yet, but I would be
interested in what the effects are when this happens and what breaks.

> masscheck rsync area will then have to be updated the next day and then
> another day for the masscheckers to validate rules and build the ruleset.
> There are only a few hours in the day that commits can happen that won't
> cause this problem.  There should be an easy way to lock in the tagged
> revisions correctly, linked to the masscheck rsync revision so they keep
> building rulesets every day.

I see the backend/nitemc/svn_checkout script just checks out trunk without
any revision or tag:
svn co http://svn.apache.org/repos/asf/spamassassin/trunk

It is called by: backend/nitemc/run_all
which I assume kicks off the nightly masscheck.

But then masses/rule-qa/corpus-nightly updates to the latest revision from
nightly-versions.txt for each corpus?
And probably using the same working copy?

But I don't know yet what's actually happening in that working copy; the
whole corpus stuff is still in the realm of magic for me at the moment ;)
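
As a minimal sketch of what a pinned checkout could look like (this is not what backend/nitemc/svn_checkout does today; the function name pinned_checkout_cmd and the idea of passing the revision in as an argument are assumptions for illustration only), a wrapper could refuse to proceed without an explicit numeric revision and only then build the svn co -r command:

```shell
#!/bin/sh
# Hypothetical helper, not part of the existing nitemc scripts.
# Prints the svn command for a checkout pinned to one revision.
SA_TRUNK_URL="http://svn.apache.org/repos/asf/spamassassin/trunk"

pinned_checkout_cmd() {
    rev=$1
    dest=$2
    # Reject anything that is not a plain decimal revision number.
    case $rev in
        ''|*[!0-9]*) echo "invalid revision: '$rev'" >&2; return 1 ;;
    esac
    [ -n "$dest" ] || { echo "missing destination directory" >&2; return 1; }
    printf 'svn co -r %s %s %s\n' "$rev" "$SA_TRUNK_URL" "$dest"
}
```

Calling pinned_checkout_cmd 1813595 trunk-new-rules-set1 prints the same command that shows up in the rescore mail. Printing rather than executing keeps the sketch safe to try out; a real script would run the command instead.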

>
> 
> From: Merijn van den Kroonenberg 
> Sent: Thursday, November 2, 2017 4:29 AM
> To: David Jones; Kevin A. McGrail
> Subject: Re: Eureka: truncation of 72_active.cf
>
> I am a bit confused about corpus revision
>
> Take for example this:
>
> Revision: 1813664
> Author: spamassassin_role
> Date: zondag 29 oktober 2017 3:47:00
> Message:
> updated scores for revision 1813595 active rules added since last
> mass-check
>
> And in the sysadmin mail from last night (rescore example)
>
> svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk
> trunk-new-rules-set1
>
> So the commit mentions revision 1813595, I assume it's also mentioned in
> corpus logs and it's actually checked out.
>
> BUT 1813595 is no valid revision for the spamassassin project??
>
> It's actually a revision in another Apache project.
>
> Revision: 1813595
> Author: deepak
> Date: zaterdag 28 oktober 2017 10:33:50
> Message:
> Improved: Add rat exclude files to excludes those files that does not need
> license header
> (OFBIZ-9856)
>
> Updated rat-excludes.txt file
> 
> Modified : /ofbiz/ofbiz-framework/trunk/rat-excludes.txt
>
> So is somewhere something wrong with detecting the correct revision?
>
>
> -Original Message-
> From: Kevin A. McGrail
> Sent: Wednesday, November 01, 2017 10:49 PM
> To: David Jones ; Merijn van den Kroonenberg
> Subject: Re: Eureka: truncation of 72_active.cf
>
> On 11/1/2017 5:31 PM, David Jones wrote:
>>
>> I found another bug in the DKIM_VALID_EF rule description that needed to
>> be wrapped in a version check that was causing the ruleset validation to
>> fail with return code 4.  Just committed another fix that should take
>> care
>> of this.
>>
>>
>> Given the timing of the rule promotions and the masscheck validation, this
>> could
>> take another ~40 hours to work itself out.  I really want to improve the
>> way things work to speed up this cycle time.
>>
> Agreed.  I am very excited to finally have this done though.
>
>




Re: Eureka: truncation of 72_active.cf

2017-11-02 Thread Dave Jones
Well crap.  I found another odd dependency that throws off the masscheck 
processing when commits are done outside the few-hour window:


https://wiki.apache.org/spamassassin/InfraNotes2017#nitemc

automc@sa-vm1:~$ ~/svn/trunk/build/mkupdates/run_nightly
+ promote_active_rules
+ pwd
/usr/local/spamassassin/automc/svn/trunk
+ /usr/bin/perl build/mkupdates/listpromotable
HTTP get: http://ruleqa.spamassassin.org/1-days-ago?xml=1
no 'mcviewing', 'mcsubmitters' microformats on day 1
URL: http://ruleqa.spamassassin.org/1-days-ago?xml=1
+ exit 25

This is a critical cron job that sets up the new rule promotions daily. 
I will dig into this one deeper but it seems that if the masscheck SA 
revision gets out of sync with new commits that may not even be 
ruleset-related, then days have to pass with no commits before the stars 
will align properly again.  Geez.  What a mess with this workflow!
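
To make a failure like the exit 25 above easier to spot in cron mail, each step could be run through a small wrapper that names the step and reports its exit code. This is only a sketch: run_step is a hypothetical helper, not something run_nightly contains, and it makes no assumption about what exit code 25 means.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of run_nightly: run one named step and
# report its exit code on stderr if it fails.
run_step() {
    step_name=$1
    shift
    rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "step '$step_name' failed with exit code $rc" >&2
    fi
    return "$rc"
}
```

For example, run_step listpromotable /usr/bin/perl build/mkupdates/listpromotable would pass the script's exit code through unchanged while adding one line that says which step stopped the run.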


I think we need to carefully document the current SVN workflow and 
redesign it to handle this better.  The masscheck processing of rulesets 
only needs to be tied to the ruleset revision staged in the rsync area 
for the current 24 hour period.  Maybe we need a new masscheck-specific 
tag separate from the rule promotion tags today?


Rule promotions can happen at any time, multiple times a day as long as 
they pass a lint check.


Masscheck validations are currently only done once a day.

My initial thought is that parts of the scripts are using the latest SVN 
revision and parts are using the latest tagged revision from rule 
promotions.  When these get out of sync, we don't get enough 
masschecking of the proper revision to keep moving everything forward.
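
A rough sketch of how that mismatch could be detected early, assuming both revision numbers are already known to the caller (how the scripts would actually obtain them is not covered here, and check_revisions_in_sync is a made-up name):

```shell
#!/bin/sh
# Hypothetical pre-flight check; both revision numbers are supplied by
# the caller, this sketch does not look them up itself.
check_revisions_in_sync() {
    staged_rev=$1     # revision staged in the masscheck rsync area
    promoted_rev=$2   # revision the rule promotion step is about to use
    if [ "$staged_rev" -eq "$promoted_rev" ]; then
        echo "in sync at r$staged_rev"
        return 0
    fi
    echo "out of sync: masscheck at r$staged_rev, promotion at r$promoted_rev" >&2
    return 1
}
```

A nightly job could call this before promoting rules and stop with a clear message instead of failing later in a less obvious place.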


Dave


On 11/02/2017 07:46 AM, Kevin A. McGrail wrote:

On 11/2/2017 8:40 AM, Merijn van den Kroonenberg wrote:
The checkout works as you would expect. But it's just very confusing that a 
revision is used which is not inside the spamassassin project.
It might also cause side effects in other parts of the process as David 
mentioned. But the checkout part is not actually broken.


I do have some considerable experience with subversion... the problem 
is more what the intention of the code should be ;)


My recommendation is do not try and unravel the thought process behind 
the code.  Stay focused on the goal which is to produce rules and 
distribute them.  Anything you do towards that goal is good.  If we 
break some eggs to make some omelets, great.


Ideally, I would like to publish more daily rulesets, focus on 
optimization, etc.


Regards,

KAM





Re: mailspike in 72_scores

2017-11-02 Thread Dave Jones

On 11/02/2017 08:11 AM, Kevin A. McGrail wrote:

On 11/2/2017 6:44 AM, Merijn van den Kroonenberg wrote:
While debugging the score generation problem, I encountered another 
weird case in 72_scores.cf.
It is about the mailspike rules which are in 50_scores so should not 
be in 72_scores.

There was already an existing issue about this:
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6400
I found the cause of this and attached a patch to Bugzilla, but since 
it's already a very old issue I wonder if someone will ever look at it 
anymore ;)

Thanks!  I did have a look at it but I don't want to interfere with 
David's work.  David, could you make the commit for him on this bug?




I certainly will.  I would like to wait a couple of days if that is OK 
since committing now could throw off the masscheck validations that we 
need to get sa-updates started again.


Dave


Regards,
KAM





Optimize score-ranges-from-freqs (needless run of parse-rules-for-masses)

2017-11-02 Thread Merijn van den Kroonenberg
Hi,

This is about /masses and intended as a possible future todo, so I'll just
drop it in this list.

The makefile contains:

tmp/rules_${SCORESET}.pl: tmp/.created ../build/parse-rules-for-masses
	perl ../build/parse-rules-for-masses -d $(RULES) -s $(SCORESET) \
	  -o tmp/rules_${SCORESET}.pl

Which generates for example a rules_0.pl.

In the same makefile is:

tmp/ranges.data: tmp/.created freqs score-ranges-from-freqs
	perl add-hitless-active-to-freqs
	perl score-ranges-from-freqs $(RULES) $(SCORESET) < freqs
	perl lock-scores 1
	mv tmp/ranges.data-new tmp/ranges.data

As you can see it runs score-ranges-from-freqs. In score-ranges-from-freqs
is this code:

my $tmpf = "tmp/rules$$.pl";
system "../build/parse-rules-for-masses ".
  "-d \"$argcffile\" -s $scoreset -o $tmpf" and die;
require $tmpf;
unlink $tmpf;

Which is doing the same thing.

So I would say change the makefile to add tmp/rules_${SCORESET}.pl as a
dependency of tmp/ranges.data.

And then use the tmp/rules_${SCORESET}.pl directly from
score-ranges-from-freqs.

And to be consistent, also add add-hitless-active-to-freqs and
lock-scores as dependencies of tmp/ranges.data, as they are all needed.
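
The "reuse the generated file" idea can be illustrated in shell, separately from the makefile change itself. The helper names rules_file and needs_regen are hypothetical, and the sketch only checks that the file exists and is non-empty; unlike make, it does not compare timestamps.

```shell
#!/bin/sh
# Hypothetical helpers illustrating "reuse the generated file if present".
rules_file() {
    # Path of the per-scoreset rules file that the makefile target generates.
    printf 'tmp/rules_%s.pl\n' "$1"
}

needs_regen() {
    # Succeeds (returns 0) when the file is missing or empty.
    [ ! -s "$(rules_file "$1")" ]
}
```

A caller could then run parse-rules-for-masses only when needs_regen "$SCORESET" succeeds, and otherwise load the existing file.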

Cheers,
Merijn



Re: mailspike in 72_scores

2017-11-02 Thread Kevin A. McGrail

On 11/2/2017 6:44 AM, Merijn van den Kroonenberg wrote:
While debugging the score generation problem, I encountered another 
weird case in 72_scores.cf.
It is about the mailspike rules which are in 50_scores so should not 
be in 72_scores.

There was already an existing issue about this:
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6400
I found the cause of this and attached a patch to Bugzilla, but since 
it's already a very old issue I wonder if someone will ever look at it 
anymore ;)

Thanks!  I did have a look at it but I don't want to interfere with 
David's work.  David, could you make the commit for him on this bug?


Regards,
KAM


Re: Eureka: truncation of 72_active.cf

2017-11-02 Thread Merijn van den Kroonenberg
The checkout works as you would expect. But it's just very confusing that a 
revision is used which is not inside the spamassassin project.
It might also cause side effects in other parts of the process as David 
mentioned. But the checkout part is not actually broken.


I do have some considerable experience with subversion... the problem is 
more what the intention of the code should be ;)
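
For background: Subversion revision numbers are repository-wide, and the log entries in this thread show the spamassassin and ofbiz trees living in the same repository, so checking out spamassassin/trunk at r1813595 simply gives trunk as it stood when that OFBiz commit was made. To see which revision last touched the path itself, svn info prints a "Last Changed Rev" field. The helper below is a hypothetical sketch that extracts that field from svn info output supplied on stdin:

```shell
#!/bin/sh
# Hypothetical helper: read `svn info` output on stdin and print the
# value of its "Last Changed Rev" field (nothing if the field is absent).
last_changed_rev() {
    sed -n 's/^Last Changed Rev: *//p'
}
```

It could be used as svn info -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk | last_changed_rev, which should report the most recent revision at or before r1813595 that changed something under trunk.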


-Original Message- 
From: Kevin A. McGrail

Sent: Thursday, November 02, 2017 1:27 PM
To: Merijn van den Kroonenberg ; David Jones ; 
sysadmins@spamassassin.apache.org

Subject: Re: Eureka: truncation of 72_active.cf

+sysadmins@s.a.o so we don't lose these conversations.

Consider asking gst...@apache.org for his help on SVN.  I simply don't
play with enough tags, branches, revisions, etc.

Either there is a bug and we are injecting the wrong version, or you are
just looking at things incorrectly.

However, I think you perhaps are just doing something wrong??

This command, for example, appears to pull an SA rule set.  Isn't that
the expected behavior?

svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk trunk-new-rules-set1 | more

A    trunk-new-rules-set1/rulesrc
A    trunk-new-rules-set1/rulesrc/10_force_active.cf
A    trunk-new-rules-set1/rulesrc/sandbox
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_MIME_no_text.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_misc_testing.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_lotsa_money.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_MIME_in_body.cf

What do you get?

Regards,
KAM

On 11/2/2017 5:29 AM, Merijn van den Kroonenberg wrote:

I am a bit confused about corpus revision

Take for example this:

Revision: 1813664
Author: spamassassin_role
Date: zondag 29 oktober 2017 3:47:00
Message:
updated scores for revision 1813595 active rules added since last 
mass-check


And in the sysadmin mail from last night (rescore example)

svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk 
trunk-new-rules-set1


So the commit mentions revision 1813595, I assume it's also mentioned in 
corpus logs and it's actually checked out.


BUT 1813595 is no valid revision for the spamassassin project??

It's actually a revision in another Apache project.

Revision: 1813595
Author: deepak
Date: zaterdag 28 oktober 2017 10:33:50
Message:
Improved: Add rat exclude files to excludes those files that does not need
license header
(OFBIZ-9856)

Updated rat-excludes.txt file

Modified : /ofbiz/ofbiz-framework/trunk/rat-excludes.txt

So is somewhere something wrong with detecting the correct revision?


-Original Message- From: Kevin A. McGrail
Sent: Wednesday, November 01, 2017 10:49 PM
To: David Jones ; Merijn van den Kroonenberg
Subject: Re: Eureka: truncation of 72_active.cf

On 11/1/2017 5:31 PM, David Jones wrote:


I found another bug in the DKIM_VALID_EF rule description that needed to 
be wrapped in a version check that was causing the ruleset validation to 
fail with return code 4.  Just committed another fix that should take 
care of this.



Given the timing of the rule promotions and the masscheck validation, this 
could take another ~40 hours to work itself out.  I really want to 
improve the way things work to speed up this cycle time.



Agreed.  I am very excited to finally have this done though.





Re: Eureka: truncation of 72_active.cf

2017-11-02 Thread Kevin A. McGrail

On 11/2/2017 8:40 AM, Merijn van den Kroonenberg wrote:
The checkout works as you would expect. But it's just very confusing that a 
revision is used which is not inside the spamassassin project.
It might also cause side effects in other parts of the process as David 
mentioned. But the checkout part is not actually broken.


I do have some considerable experience with subversion... the problem 
is more what the intention of the code should be ;)


My recommendation is do not try and unravel the thought process behind 
the code.  Stay focused on the goal which is to produce rules and 
distribute them.  Anything you do towards that goal is good.  If we 
break some eggs to make some omelets, great.


Ideally, I would like to publish more daily rulesets, focus on 
optimization, etc.


Regards,

KAM



Re: Eureka: truncation of 72_active.cf

2017-11-02 Thread Kevin A. McGrail

+sysadmins@s.a.o so we don't lose these conversations.

Consider asking gst...@apache.org for his help on SVN.  I simply don't 
play with enough tags, branches, revisions, etc.


Either there is a bug and we are injecting the wrong version, or you are 
just looking at things incorrectly.


However, I think you perhaps are just doing something wrong??

This command, for example, appears to pull an SA rule set.  Isn't that 
the expected behavior?


svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk 
trunk-new-rules-set1  | more

A    trunk-new-rules-set1/rulesrc
A    trunk-new-rules-set1/rulesrc/10_force_active.cf
A    trunk-new-rules-set1/rulesrc/sandbox
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_MIME_no_text.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_misc_testing.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_lotsa_money.cf
A    trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_MIME_in_body.cf

What do you get?

Regards,
KAM

On 11/2/2017 5:29 AM, Merijn van den Kroonenberg wrote:

I am a bit confused about corpus revision

Take for example this:

Revision: 1813664
Author: spamassassin_role
Date: zondag 29 oktober 2017 3:47:00
Message:
updated scores for revision 1813595 active rules added since last 
mass-check


And in the sysadmin mail from last night (rescore example)

svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk 
trunk-new-rules-set1


So the commit mentions revision 1813595, I assume it's also mentioned 
in corpus logs and it's actually checked out.


BUT 1813595 is no valid revision for the spamassassin project??

It's actually a revision in another Apache project.

Revision: 1813595
Author: deepak
Date: zaterdag 28 oktober 2017 10:33:50
Message:
Improved: Add rat exclude files to excludes those files that does not need
license header
(OFBIZ-9856)

Updated rat-excludes.txt file

Modified : /ofbiz/ofbiz-framework/trunk/rat-excludes.txt

So is somewhere something wrong with detecting the correct revision?


-Original Message- From: Kevin A. McGrail
Sent: Wednesday, November 01, 2017 10:49 PM
To: David Jones ; Merijn van den Kroonenberg
Subject: Re: Eureka: truncation of 72_active.cf

On 11/1/2017 5:31 PM, David Jones wrote:


I found another bug in the DKIM_VALID_EF rule description that needed 
to be wrapped in a version check that was causing the ruleset 
validation to fail with return code 4.  Just committed another fix 
that should take care of this.



Given the timing of the rule promotions and the masscheck validation, this 
could take another ~40 hours to work itself out.  I really want to 
improve the way things work to speed up this cycle time.



Agreed.  I am very excited to finally have this done though.