Re: Rule updates after 3.3.0

Justin Mason Tue, 29 Dec 2009 05:17:41 -0800

On Mon, Dec 28, 2009 at 02:18, Warren Togami <[email protected]> wrote:
> After the release of 3.3.0 we need to think about how rule updates as
> distributed via sa-update will work.  The goal here is to make it quick and
> easy to safely add new or adjust existing rules so sa-update keeps
> spamassassin effective over time.  This extends the useful life-span of a
> spamassassin release.  We can then propose a 3.3.x maintenance release only
> after we feel enough worthwhile changes make it worthwhile to do a release,
> or for security releases.
>
> jm explained a few weeks ago that currently 3.2.x sa-update rule updates are
> not auto-updated because we lack a separate ruleqa system.  Our ruleqa
> system tests only the svn trunk in the nightly masscheck.  It would be too
> much for our nightly masscheck volunteers to run the nightly masscheck
> twice, so doing both is not an option.
>
> In talking with jm a few weeks ago, we seem to be in agreement that we
> should change this procedure for 3.3.x.  Nightly masscheck will continue to
> check using the svn trunk, but rule updates will be pushed to 3.3.x users.
>
> Rule Version Conditionals
> =========================
> jm says he added a conditional system that might allow us to mark certain
> rules as compatible with a certain version of spamassassin. This will allow
> us to add new types of rules to trunk without breaking 3.3.x rule updates.
>  Is there any documentation for these rule conditionals?


perldoc Mail::SpamAssassin::Conf --

       if (boolean perl expression)
           can(Name::Of::Package::function_name)
               This is a function call that returns 1 if the perl package
               named "Name::Of::Package" includes a function called
               "function_name", or "undef" otherwise.  Note that packages
               can be SpamAssassin plugins or built-in classes, there's no
               difference in this respect.


We then require that any rule-breaking change also adds a method that rule
files can test for with can(), e.g.

    ChangedPlugin.pm

        sub has_new_feature { 1; }

    rulesfile.cf

        if can(Mail::SpamAssassin::Plugin::ChangedPlugin::has_new_feature)
        [...new rules...]
        else
        [...backwards compat...]
        endif


We also need to add a Hudson job that builds the 3.3.x maintenance branch
using trunk's rules and runs the tests, to ensure that the maint branch works
OK with trunk's rules.


> With rule version conditionals we might consider that svn trunk targets the
> next 3.3.x maintenance release instead of working on a branch.  We have
> limited developer hours so we might be better off focusing exclusively on
> trunk.  This worked reasonably well during the past year with pre-3.3.0
> trunk.  Any thoughts about this part?

I'm -1 on this idea, however.  We've previously always switched to a
maintenance branch for post-release fixes, and it's easy enough.

The benefit is that new features/code that aren't suitable for the maint
releases can easily be put into trunk; otherwise there's a temptation to either

    1. shoehorn them into a maint release when they're not ready (bad), or

    2. stick them in a dev branch that quickly gets forgotten/goes bad.

Both are better avoided.

In practice, switching to a 3.3.x maint branch for future 3.3.x releases/
updates is very low-overhead.  It's just a matter of typing

        svn sw https://.... https://.....

in your SVN checkout directory.


> Explicit Promotion
> ==================
> The ruleqa system periodically has problems where it gets stuck having
> processed only the bb-* corpora but not others.  This seems to cause the
> combined results to swing wildly and rules are promoted and demoted for
> seemingly no reason.

Suggestion: require a "quorum" of both bb-* and non-bb-* corpora before any
rule promotion/demotion happens.  It already requires a quorum of N corpora
(of any type).  If the quorum isn't met, the existing promoted rules list is
kept as-is.
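
A minimal sketch of that quorum check, written in Python for illustration
rather than taken from the ruleqa scripts.  The function names, the default
thresholds and the non-bb corpus names used in any call are hypothetical; only
the "bb-" prefix and the keep-as-is fallback come from the suggestion above.

```python
def quorum_met(corpora, min_total, min_bb, min_non_bb):
    """Return True if enough corpora of each kind have reported results.

    corpora is a list of corpus names; names starting with "bb-" are
    counted as bb-* corpora, everything else as non-bb.
    """
    bb = sum(1 for name in corpora if name.startswith("bb-"))
    non_bb = len(corpora) - bb
    return len(corpora) >= min_total and bb >= min_bb and non_bb >= min_non_bb


def next_promoted(current, candidate, corpora,
                  min_total=4, min_bb=1, min_non_bb=1):
    """Pick the promoted-rules list to publish.

    The threshold defaults are placeholders (assumptions), not project
    policy.  Without a quorum, the existing list is kept as-is.
    """
    if quorum_met(corpora, min_total, min_bb, min_non_bb):
        return candidate
    return current
```

With only bb-* corpora processed, next_promoted() returns the current list
unchanged, which is the "stuck ruleqa" case described in the quoted text.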


> The ruleqa system is incapable of auto-promoting rare hitting but
> ultra-accurate rules like VANITY.

yes, definitely a good candidate for force-active...

> For reasons like this, we should force active certain rules when we're
> certain they are safe.  Adding the rule to rulesrc/10_force_active.cf seems
> to be sufficient.
>
> I propose that we have simple, low bar of requirements to govern explicit
> promotion.
>
> * By judgement call the rule is obviously safe, or proven by ruleqa.
> * Any two committers agree.
> * No bug required, but state who agreed in the commit.

+1

> Scoring
> =======
> Currently auto-promoted rules all have the score of 1.  Scores need to be
> defined in rules/50_scores.cf to have any other score.
>
> I propose that we have simple, low bar of requirements to control assignment
> of any score greater than 1.
>
> * One committer per point must agree, rounded up.  (1.4 points require two
> committers to agree.  2.3 points require three.)
> * No bug required, but state who agreed in the commit.

I think it's a good idea, but I'm worried about two things:

    - it'll take a lot of overhead in wrangling voters; 3 voters may be too
      much.  I'd be happy with just 2, since we can always retrospectively veto
      in cases where we disagree.

    - Daryl, thoughts regarding the weekly run of the GA?  is that workable yet?
      this proposed system is incompatible with that.
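
For reference, the "one committer per point, rounded up" rule from the quoted
proposal, plus the cap of 2 voters floated above, can be sketched like this in
Python.  The function name and the cap parameter are hypothetical, and
returning 0 for scores of 1 or less is an assumption based on the proposal
only covering scores greater than 1.

```python
import math


def committers_required(score, cap=None):
    """Number of agreeing committers needed to assign a given score.

    One committer per point, rounded up.  If cap is given, the result is
    limited to that many committers (e.g. cap=2).
    """
    if score <= 1:
        # Assumption: the default auto-promoted score of 1 needs no vote.
        return 0
    needed = math.ceil(score)
    if cap is not None:
        needed = min(needed, cap)
    return needed
```

Under the uncapped rule, 1.4 points need two committers and 2.3 points need
three, matching the examples in the proposal; with cap=2, 2.3 points would
need only two.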

JH:
>   I was hoping that at least some sort of automatic analysis for assigning
>   scores could be incorporated into the process. Is the consensus that the
>   nightly masscheck corpus isn't large enough to support doing this?

Warren:
> That would be ideal, but yes, the nightly masscheck is WAY too small. Even our
> mcsnapshot was too small and required lots of manual massaging to output
> scores that satisfied us.

If I recall correctly, the initial plan for the weekly GA was that it would
only generate scores for newly-defined rules in the sandboxes.  If the "base",
non-sandbox ruleset had stable, infrequently-changed scores, and the sandbox
rules were more in flux, that would insulate us against the manual-massaging
problem.

Anyway, that really needs a comment from Daryl ;)

-- 
--j.
