Re: ruleqa user llanga

2017-11-10 Thread Dave Jones

On 11/10/2017 09:01 AM, Merijn van den Kroonenberg wrote:



Day 2 doesn't have that table with "mcviewing".  The next question is
what is causing this problem.  Is it related to new commits that throw
off the masscheck processing?


The "2 days ago" view doesn't highlight a current masscheck, but it still
shows a result at the bottom, so it's showing *something*. I think it's
likely the masscheck present in the daterev input field:
20171108-r1814560-n
But that one isn't in any daterev listing, not even in the full listing.

So I think something in ruleqa.cgi, which builds the daterev list, is
broken and leaves out some masschecks.
If I get the cachefile and the directory listings, I can go debug where
things go pear-shaped.



I have found one dubious piece of code where the masschecks are indexed
based on their svn rev number. But that is not a unique value, as the
same revision can be masschecked multiple times (by a different
submitter or on a different date).
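
To illustrate the suspected collision, here is a minimal sketch in Python. It is not the actual ruleqa.cgi code (which is Perl); the function names are made up for illustration, and the only real data is the pair of dateRevs that show up in this thread. It assumes a dateRev has the form date-revision-tag.

```python
def index_by_rev(daterevs):
    """Index dateRevs by SVN revision only (the suspect approach)."""
    index = {}
    for dr in daterevs:
        date, rev, tag = dr.split("-")
        # A later entry with the same revision silently replaces the earlier one.
        index[rev] = dr
    return index


def index_by_daterev(daterevs):
    """Index by the full dateRev string, which stays unique per day."""
    return {dr: dr.split("-")[1] for dr in daterevs}


daterevs = ["20171108-r1814560-n", "20171109-r1814560-n"]
by_rev = index_by_rev(daterevs)          # only one entry survives
by_daterev = index_by_daterev(daterevs)  # both entries survive
```

With the revision as the only key, the 20171108 masscheck drops out of the index, which would match a dateRev that can be viewed directly but never appears in a listing.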


I think this is in fact the case.
There is something weird with masscheck user llanga.
Either something is off with the timing of masscheck result submission or
that user submits the masscheck result twice (once more the next day for
the same revision).
I think that's what triggers the bug in the ruleqa page.

ls -l html/20171108/r1814560-n/LOGS.all-*-llanga*
5356811 Nov 10 01:05 html/20171108/r1814560-n/LOGS.all-ham-llanga.log.gz
521798 Nov 10 01:06 html/20171108/r1814560-n/LOGS.all-spam-llanga.log.gz

ls -l html/20171109/r1814560-n/LOGS.all-*-llanga*
5356811 Nov 10 08:12 html/20171109/r1814560-n/LOGS.all-ham-llanga.log.gz
521798 Nov 10 08:12 html/20171109/r1814560-n/LOGS.all-spam-llanga.log.gz

b14039f7b3ef3329d6bbd80e8a2eb5e04eb62129  html/20171108/r1814560-n/LOGS.all-ham-llanga.log.gz
b14039f7b3ef3329d6bbd80e8a2eb5e04eb62129  html/20171109/r1814560-n/LOGS.all-ham-llanga.log.gz

Same checksum, so same files.
The question is: does the user do something wrong, or is some scripting
messed up (maybe related to bad timing or timezone issues)?
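
A check like the one above can be scripted to catch resubmitted logs across all date directories. This is only a sketch: it assumes the checksums shown are SHA-1 (they are 40 hex digits long), and the helper names are hypothetical, not part of any existing masscheck tooling.

```python
import hashlib
from collections import defaultdict


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical(paths):
    """Map each checksum to the paths sharing it, keeping only duplicates."""
    groups = defaultdict(list)
    for path in paths:
        groups[sha1_of(path)].append(path)
    return {h: ps for h, ps in groups.items() if len(ps) > 1}
```

Passing it the LOGS.all-ham-llanga.log.gz paths from both date directories would report them as one group if they are byte-identical.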



Please see attached patch for masses/rulequa/ruleqa.cgi


I think I failed to attach the patch correctly, but I sent it directly to Dave.



If this is not it, then I suspect the code around line 453, which trims
some revisions away. But the code is very hard to read.





I think I figured out what was causing problems with the masscheck SVN 
revision getting thrown off by commits and llanga.  I was determining 
the $REVISION in masses/rule-update-score-gen/generate-new-scores.sh 
around line 123 by finding the newest SVN revision.  I thought the 
staging of the rsync dir and the SVN tagged versions would keep that SVN 
revision locked in for a 24-hour period.  Now I have updated the logic 
to find the SVN revision with the most occurrences in all of the corpus 
for that particular scoreset type.
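
The "most occurrences" selection can be sketched as follows. This is Python for illustration only, not the shell logic in generate-new-scores.sh; the function name, the sample list, and the tie-break rule are assumptions made for the sketch.

```python
from collections import Counter


def most_common_revision(revisions):
    """Return the SVN revision that occurs most often, or None if empty."""
    counts = Counter(revisions)
    if not counts:
        return None
    # On a tie, prefer the highest revision number (a choice made for this sketch).
    return max(counts, key=lambda rev: (counts[rev], rev))


# Hypothetical revisions found in the corpus for one scoreset type.
revs = [1814560, 1814560, 1814560, 1814822]
picked = most_common_revision(revs)
```

Compared with taking the newest revision, a single submitter running against a newer or older checkout no longer changes which revision is picked.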


It might work best if a tag file was dropped with the SVN revision by 
the run_nightly script that stages the masscheck area so the 
generate-new-scores.sh could be better matched to that SVN revision.  If 
an SVN command could be used to find the latest sa-update tagged 
version, then that could be used instead of a tag file.


--
Dave


updateDNS.sh on sa-vm1.apache.org - DNS updates disabled

2017-11-10 Thread noreply

3.3.3.updates (TXT) -> \"1814822\"

File /usr/local/bin/updateDNS.disabled exists, not updating DNS.


ruleqa user llanga

2017-11-10 Thread Merijn van den Kroonenberg
>
>>> Day 2 doesn't have that table with "mcviewing".  The next question is
>>> what is causing this problem.  Is it related to new commits that throw
>>> off the masscheck processing?
>>
>> The "2 days ago" view doesn't highlight a current masscheck, but it still
>> shows a result at the bottom, so it's showing *something*. I think it's
>> likely the masscheck present in the daterev input field:
>> 20171108-r1814560-n
>> But that one isn't in any daterev listing, not even in the full listing.
>>
>> So I think something in ruleqa.cgi, which builds the daterev list, is
>> broken and leaves out some masschecks.
>> If I get the cachefile and the directory listings, I can go debug where
>> things go pear-shaped.
>>
>
> I have found one dubious piece of code where the masschecks are indexed
> based on their svn rev number. But that is not a unique value, as the
> same revision can be masschecked multiple times (by a different
> submitter or on a different date).

I think this is in fact the case.
There is something weird with masscheck user llanga.
Either something is off with the timing of masscheck result submission or
that user submits the masscheck result twice (once more the next day for
the same revision).
I think that's what triggers the bug in the ruleqa page.

ls -l html/20171108/r1814560-n/LOGS.all-*-llanga*
5356811 Nov 10 01:05 html/20171108/r1814560-n/LOGS.all-ham-llanga.log.gz
521798 Nov 10 01:06 html/20171108/r1814560-n/LOGS.all-spam-llanga.log.gz

ls -l html/20171109/r1814560-n/LOGS.all-*-llanga*
5356811 Nov 10 08:12 html/20171109/r1814560-n/LOGS.all-ham-llanga.log.gz
521798 Nov 10 08:12 html/20171109/r1814560-n/LOGS.all-spam-llanga.log.gz

b14039f7b3ef3329d6bbd80e8a2eb5e04eb62129  html/20171108/r1814560-n/LOGS.all-ham-llanga.log.gz
b14039f7b3ef3329d6bbd80e8a2eb5e04eb62129  html/20171109/r1814560-n/LOGS.all-ham-llanga.log.gz

Same checksum, so same files.
The question is: does the user do something wrong, or is some scripting
messed up (maybe related to bad timing or timezone issues)?

>
> Please see attached patch for masses/rulequa/ruleqa.cgi

I think I failed to attach the patch correctly, but I sent it directly to Dave.

>
> If this is not it, then I suspect the code around line 453, which trims
> some revisions away. But the code is very hard to read.




Re: Cron <automc@sa-vm1> ~/svn/trunk/build/mkupdates/run_nightly | /usr/bin/tee /var/www/automc.spamassassin.org/mkupdates/mkupdates.txt

2017-11-10 Thread Merijn van den Kroonenberg

>> Day 2 doesn't have that table with "mcviewing".  The next question is
>> what is causing this problem.  Is it related to new commits that throw
>> off the masscheck processing?
>
> The "2 days ago" view doesn't highlight a current masscheck, but it still
> shows a result at the bottom, so it's showing *something*. I think it's
> likely the masscheck present in the daterev input field: 20171108-r1814560-n
> But that one isn't in any daterev listing, not even in the full listing.
>
> So I think something in ruleqa.cgi, which builds the daterev list, is
> broken and leaves out some masschecks.
> If I get the cachefile and the directory listings, I can go debug where
> things go pear-shaped.
>

I have found one dubious piece of code where the masschecks are indexed
based on their svn rev number. But that is not a unique value, as the
same revision can be masschecked multiple times (by a different
submitter or on a different date).

Please see attached patch for masses/rulequa/ruleqa.cgi

If this is not it, then I suspect the code around line 453, which trims
some revisions away. But the code is very hard to read.

Re: Eureka: truncation of 72_active.cf

2017-11-10 Thread Dave Jones

On 11/09/2017 09:45 AM, Merijn van den Kroonenberg wrote:

solved


/usr/local/spamassassin/automc/svn/trunk/build/mkupdates/do-stable-update-with-scores

should be calling

/usr/local/spamassassin/automc/svn/masses/rule-update-score-gen/do-nightly-rescore-example.sh



Here you actually explain why your changes are not effective. I checked
what you say here and you are right.

Basically, what happens is that code execution jumps from
/usr/local/spamassassin/automc/svn/trunk
to
/usr/local/spamassassin/automc/svn/masses

Your changes are in /usr/local/spamassassin/automc/svn/trunk as you showed
below. And /usr/local/spamassassin/automc/svn/masses has no code changes.

So do-nightly-rescore-example.sh is executed from the separate masses
checkout and not from the trunk checkout. (No idea why there is a separate
masses checkout.)



First, let me say that this was all a huge mess to figure out back in 
April.  There is an Apache config file that is used under the masses so 
that is a RO checkout.  This all used to run from different home dirs on 
multiple servers.


At first, I was going to have all of the scripts run from their own RO 
checkout dir and cron an "svn up" just before each runs but I found some 
scripts that needed to do commits.  Then my plan switched to having 
everything run from the trunk dir.


Thanks for finding that path that I missed.  I will fix it now.

Dave