Justin Mason wrote:
Daryl C. W. O'Shea writes:
Justin Mason wrote:
Daryl C. W. O'Shea writes:
Justin Mason wrote:
At a minimum I'd disable the 3.2 updates until we figure out exactly
what we want to do (actually that's what I intended).
OK, sounds fine.
So, to do so, I'd just create an update that contains an identical
ruleset to the one being released with 3.2.0, using, say, 533971 as the
update number, and then add the appropriate TXT record to the zone file:
0.2.3.updates.spamassassin.org. 3600 IN TXT "533971"
Actually, the cron jobs run as "updatesd" on the zone would quickly
move on and create another update tomorrow, so that wouldn't help. ;)
I assumed (good start, eh?) that there was a *.2.3 wildcard like 3.1 has,
so a 0.2.3 record would have worked.
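
To make the record naming concrete: the name in the example above is the release version with its components reversed, prepended to the updates zone. The following is a rough Python sketch of that convention and of the "explicit record beats wildcard" behaviour being assumed here. The function names are made up for illustration, the lookup is a deliberate simplification of real DNS wildcard matching, and the wildcard value is a placeholder, not a real update number.

```python
def update_record_name(version, zone="updates.spamassassin.org"):
    """Build the TXT record name for a release, e.g. 3.2.0 -> 0.2.3.<zone>."""
    reversed_parts = reversed(version.split("."))
    return ".".join(reversed_parts) + "." + zone + "."


def resolve_txt(name, records):
    """Simplified lookup: an exact record wins, else try a one-label wildcard.

    `records` is a plain dict standing in for the zone file (an assumption
    made for this sketch; a real resolver does considerably more).
    """
    if name in records:
        return records[name]
    # Replace the leftmost label with "*" and try again.
    _, _, parent = name.partition(".")
    return records.get("*." + parent)


# Hypothetical zone contents: a wildcard for 3.2.x plus a pinned 3.2.0 record.
zone = {
    "*.2.3.updates.spamassassin.org.": "<nightly update number>",
    "0.2.3.updates.spamassassin.org.": "533971",
}
pinned = resolve_txt(update_record_name("3.2.0"), zone)
```

With records like these, 3.2.0 clients would see the pinned number while any other 3.2.x version would fall through to the wildcard.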
I've edited the 'build/mkupdates/run_part2' script, which is run by
cron, and set it to write the updates to a different dir (
http://buildbot.spamassassin.org/updatedev/ ) and to a different
record in the DNS zone ( 0.2.3.updates.dev.spamassassin.org ).
The existing files/records will remain the same until we edit the
script again and uncomment the "live" lines; in the meantime, we'll
be able to see what *would* be going into the updates by checking
those locations.
Should we be rolling an update that contains just the base release rules
and scores, or are we happy with leaving the current update with most new
rules scoring 1.0 for now? I'm not sure myself... I'm suffering from a
case of "I'm going to use the dev version with generated scores anyway".
On top of that, there were lint failures caused (iirc) by plugin
dependencies being missed; and false positives caused by too-high scores,
when multiple rules had good 'freqs' but were matching the same message
subset. (mkrules already takes care of the former, and running a rescorer
like the GA would have fixed the latter.)
FWIW, the rescoring I've got running now seems to be producing sane
scores. I've been using the generated set1 scores for set3 with no
complaints so far.
Good news. (it should be measurable anyway, since we can run set3
mass-checks with those scores against our corpora.)
Actually if we're going to start enabling bayes in our mass-checks we
can just generate scores for set2 and set3 (and still generate set0 and
set1 from the same logs).
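
As a reminder of which score set is which, here is a small Python sketch. The set numbering (0 = neither, 1 = network tests only, 2 = Bayes only, 3 = both) is the standard SpamAssassin one. The second helper is only an assumption about how non-Bayes sets could be derived from Bayes-enabled logs, by dropping the BAYES_* hits before rescoring; both function names are hypothetical.

```python
def score_set(network_tests, bayes):
    """Return the score set index used for a given configuration."""
    return (2 if bayes else 0) + (1 if network_tests else 0)


def strip_bayes_hits(rule_hits):
    """Drop BAYES_* rule hits so a Bayes-enabled log line can also feed
    the non-Bayes sets (an assumed approach, not the actual rescorer code)."""
    return [rule for rule in rule_hits if not rule.startswith("BAYES_")]
```

So a single net-plus-Bayes mass-check log could, under that assumption, serve set3 as-is and set1 after filtering.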
[snip]
Regardless of what the updates are based on, really, there should be a
step in the automated process that attempts to lint the update against
every version that the update is targeted for. In the event of a
failure the update wouldn't be published (maybe it already does this, but
I don't remember it doing so).
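
A rough Python sketch of the kind of gate meant here, not what the mkupdates scripts currently do: lint the candidate update once per targeted version and publish only if every lint passes. The names `lint_gate` and `lint_with_binary` and the version-to-executable map are hypothetical. The only real interface assumed is the `spamassassin --lint` command with its `--configpath` option pointing at the candidate rules directory.

```python
import subprocess


def lint_gate(update_dir, targets, lint):
    """Lint update_dir against each targeted version.

    `lint` is any callable (version, update_dir) -> bool. Returns a tuple
    (ok_to_publish, failed_versions).
    """
    failures = [version for version in targets if not lint(version, update_dir)]
    return (not failures, failures)


def lint_with_binary(binaries):
    """Build a lint callable from a hypothetical map of version -> path to
    that version's spamassassin executable."""
    def lint(version, update_dir):
        cmd = [binaries[version], "--lint", "--configpath=" + update_dir]
        # A zero exit status is treated as a clean lint.
        return subprocess.run(cmd).returncode == 0
    return lint
```

Passing the lint step in as a callable keeps the publish decision separate from however each release ends up being installed on the build box.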
I was actually thinking of this model:
- 3.1.x updates: remain as they are, manually updated.
- the current nightly auto-update generation switches to generating
updates for 3.2.x; and our nightly mass-checks are m-c'd using the code
in the 3.2.x maintenance branch; in other words, *rule* development
runs against 3.2.x, not against trunk.
- trunk has no nightly mass-check. (at least initially)
I was thinking the same thing. So as long as we move the nightly
mass-checks to 3.2, rather than trunk, before resuming updates, this
(getting the update generation process to lint the update against targeted
releases) is mainly just a QA paranoia enhancement.
Daryl