Re: Rules updates

Matt Kettler Tue, 08 Jun 2010 21:47:22 -0700

On 6/8/2010 11:22 PM, Alex wrote:
> Hi,
>
>   
>>  We also very loudly repeatedly state on the list that if you want to
>> keep abreast of the latest spam, you need to be running the latest
>> version of the codebase (can't take advantage of new features without
>> it!), but don't have that clearly documented either.
>>     
> It would be great if you could document exactly what features are
> exclusively available in 3.3.x? In other words, can you quantify how
> much is being missed by continuing to use v3.2.5?
>
> I've read the release notes for v3.3, but I see very little in the
> "MAIN NEW FEATURES" of the release notes that I can see would itself
> result in a marked improvement in catch ratio.
>


Yes, the release notes are largely focused on the things that matter for
configuration, compatibility, etc.

> I know it's obviously a good idea to upgrade, but in a nutshell, how
> much different is one version from the other?
>   

Historically, a change in the second number of the SA release (i.e. 3.2.x
to 3.3.x) is where we introduce radical changes to the ruleset. Right
before an x.y.0 release, a large batch of new rules is added, and all
rules are up on the chopping block for elimination. This is also when a
whole new scoreset gets generated (historically a long, slow process that
took a lot of CPU time). Sa-updates in 3.1.x would add and remove some
rules, change a few scores, etc., but typically only as needed. 3.2.x
expanded that a bit, and we did a few rescoring runs. 3.3 is moving to a
newer model where we're regenerating the scores much more often (thanks
in part to the faster perceptron scoring tool introduced in 3.2).

So if you really want a big shift in rules, version updates are
traditionally where that happens. Sa-updates will tend to push the best
and brightest new rules, but many of the "good but not great" rules wait
until a version update, where they can vie for space in the ruleset in a
competitive deathmatch with other rules... :-)

As for the release notes, these bits would readily affect accuracy:

Main New Features section:
(ipv6, dkim and AWL changes may matter if present in your environment)

Rules section:
- new scores were generated by a genetic algorithm (GA) and then manually
  tweaked based on cleaned datasets supplied by a dozen volunteers;

- dropped redundant rules or rules causing too many false positives;

- added or updated many rules;

Plugin section:
- new plugins: FreeMail, PhishTag, Reuse;

Bug fixes:
- fixed some cases where :addr headers were parsed incorrectly

- fixed leakage of 'whitelist_from_rcvd' entries between spamd users;

- the 'exists:' evaluator in HEADER rules now works as documented
  and tests for existence of a header field, instead of testing for
  a header field body being nonempty; internally, the pms->get can
  also now distinguish between empty and nonexistent header fields;

- applied fixes to header fields parsing in several places: header field
  names are case-insensitive, whitespace is not required after a colon,
  obsolete rfc822 syntax allowed whitespace before a colon;
  VBounce: match "Received:" only at the beginning of a line;

- fixed parsing of multi-line Received header fields for
  BOUNCE_MESSAGE/VBOUNCE_MESSAGE et al
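
To make the header-parsing fixes above a bit more concrete, here is a
minimal Python sketch of the behaviour the notes describe. It is not
SpamAssassin's actual (Perl) code; the names FIELD_RE, parse_headers,
exists and nonempty are made up for illustration, and it assumes the
header lines have already been unfolded.

```python
import re

# Field name (printable ASCII except the colon), optional whitespace before
# the colon (obsolete RFC 822 syntax), the colon, optional whitespace, body.
FIELD_RE = re.compile(r'^([!-9;-~]+)[ \t]*:[ \t]*(.*)$')

def parse_headers(lines):
    """Return {lowercased field name: [bodies]} for unfolded header lines."""
    fields = {}
    for line in lines:
        m = FIELD_RE.match(line)
        if m:
            # Field names are case-insensitive, so store them lowercased.
            fields.setdefault(m.group(1).lower(), []).append(m.group(2))
    return fields

def exists(fields, name):
    """True if the field is present at all, even with an empty body."""
    return name.lower() in fields

def nonempty(fields, name):
    """True only if some instance of the field has a non-blank body."""
    return any(body.strip() for body in fields.get(name.lower(), []))

# Hypothetical sample headers: an empty Subject, whitespace before the
# colon, and no whitespace after the colon.
hdrs = parse_headers(["Subject:", "X-Mailer :foo", "TO:bob@example.com"])
```

With these sample headers, exists(hdrs, "subject") is true while
nonempty(hdrs, "subject") is false, which is the distinction between an
empty and a nonexistent header field that the 'exists:' fix is about.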

> Thanks,
> Alex
>
>
