Until we have a good score file, why publish the file anyway? DNS should
stay with the last known good file.

I would generate the tick file somewhere temporary, then update it with
the tock data and publish/update DNS.

The mirrors shouldn't even see the temporary file.
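A minimal sketch of that staging idea, not the existing update scripts. The function name, the `staging_dir`/`publish_dir` arguments, and the "tick then tock" byte layout are all hypothetical. It assumes the staging directory sits on the same filesystem as the published directory (so the final rename is atomic) and that mirrors only ever read from `publish_dir`. The DNS update itself is left out.

```python
import os
import tempfile


def publish_update(staging_dir, publish_dir, name, tick_data, tock_data):
    """Stage tick+tock privately, then move the result into the public dir.

    Hypothetical helper: returns the published path, or None when there is
    no tock (score) data yet, in which case nothing is touched and the last
    known good file stays in place.
    """
    if tock_data is None:
        return None  # no good scores yet: keep serving the previous file

    # The temporary file lives outside publish_dir, so mirrors never see it.
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir, prefix=name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(tick_data)  # "tick": the rule update
            tmp.write(tock_data)  # "tock": the regenerated scores
            tmp.flush()
            os.fsync(tmp.fileno())
        os.chmod(tmp_path, 0o644)  # mkstemp creates the file as 0600
        final_path = os.path.join(publish_dir, name)
        # Atomic on the same filesystem: readers see the old or the new file.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Only after this point would DNS be pointed at the new version.
    return final_path
```

Calling it with `tock_data=None` is a no-op, which is the "stay with the last known good file" case.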

Thoughts?

KAM

--
Kevin A. McGrail
Asst. Treasurer & VP Fundraising, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171

On Sun, Mar 25, 2018 at 12:07 PM, Dave Jones <da...@apache.org> wrote:

> On 03/20/2018 06:18 PM, Bill Cole wrote:
>
>> On 20 Mar 2018, at 16:18, Dave Jones wrote:
>> [...]
>>
>>> I thought they were different numbers. They should be.  The SVN version
>>> number is shared across all Apache projects using SVN so hours later there
>>> should be a different SVN commit number between the tick and tock.
>>>
>>
>> But mkupdate-with-scores doesn't use the SVN revision number. It uses the
>> highest revision number in the 9th single-space-delimited token of the
>> first line of any of the trunk/rulesrc/scores/scores-set* files. That is
>> more than slightly fragile, but it isn't broken yet.
>>
>> The goal of this seems to be to get a score set that is evolved using the
>> current active rule set. From that perspective, it makes sense to tag the
>> later update bundle with revision number of the active rule set. It seems
>> less sensible to be releasing the earlier (run-nightly) update bundle at
>> all, since it is essentially an interim state between logically coherent
>> sets of rules and scores.
>>
>> Which raises the issue of whether anyone has ever tried a rigorous
>> comparison between the GA and Perceptron rescoring models. If the reason
>> for publishing an interim update and a revised one after an 18-hour gap
>> is how long GA is taking to run, we may want to examine how much accuracy
>> we are really buying with GA.
>>
>
> I looked into this more this morning and now I am not sure I want to
> change anything without others' consensus.  A year or so ago, we didn't
> have consistent masscheck updates like we have had for the past 8+ months so only
> the "tick" would happen regularly for basic rule updates with the same/last
> 72_scores.cf.  Now that we have a consistent "tock" this issue has been
> brought to light.
>
> I could remove the 72_scores.cf from the "tick" like it was before but
> that still leaves a hole for fresh installs of SA that run an sa-update to
> receive the first ruleset that may not contain a 72_scores.cf.  If
> another sa-update is never run, then that SA instance could have some very
> odd scoring with many rules defaulting to 1.0.
>
> Thoughts?  Leave it as is until we can examine the GA rescorer? I
> understand we have the same ruleset updating twice a day now but it seems
> to have been OK for the past 8+ months.
>
> Dave
>
>
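For reference, the revision lookup Bill describes above (the highest value in the 9th single-space-delimited token of the first line of the scores-set* files) could be sketched as follows. This is an illustration, not the code in mkupdate-with-scores. The function name is hypothetical, and it assumes the 9th token contains the revision as a run of digits. Files that do not fit that shape are skipped, which also shows why the approach is fragile.

```python
import glob
import os
import re


def highest_scores_revision(scores_dir):
    """Return the largest revision found in the scores-set* files, or None.

    Hypothetical helper: reads only the first line of each file and looks at
    the 9th token when that line is split on single spaces.
    """
    revisions = []
    for path in glob.glob(os.path.join(scores_dir, "scores-set*")):
        with open(path, encoding="utf-8", errors="replace") as fh:
            first_line = fh.readline().rstrip("\n")
        # Split on single spaces only, so doubled spaces shift the tokens.
        tokens = first_line.split(" ")
        if len(tokens) < 9:
            continue  # line too short to carry a 9th token
        match = re.search(r"\d+", tokens[8])  # assumption: digits in token 9
        if match:
            revisions.append(int(match.group()))
    return max(revisions) if revisions else None
```

Any change to the header layout of those files, even an extra space, would shift the token position and silently change the result.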
