On 6/8/2010 11:22 PM, Alex wrote:
> Hi,
>
>> We also very loudly repeatedly state on the list that if you want to
>> keep abreast of the latest spam, you need to be running the latest
>> version of the codebase (can't take advantage of new features without
>> it!), but don't have that clearly documented either.
>>
> It would be great if you could document exactly what features are
> exclusively available in 3.3.x? In other words, can you quantify how
> much is being missed by continuing to use v3.2.5?
>
> I've read the release notes for v3.3, but I see very little in the
> "MAIN NEW FEATURES" of the release notes that I can see would itself
> result in a marked improvement in catch ratio.
>
Yes, that's largely focused on the things that will matter to
configuration, compatibility, etc.

> I know it's obviously a good idea to upgrade, but in a nutshell, how
> much different is one version from the other?
>

Historically, a change in the second number of the SA release (i.e.
3.2.x to 3.3.x) is where we introduce radical changes to the ruleset.
Right before an x.y.0 release, a large batch of new rules is added, and
all rules are up on the chopping block for elimination. This is also
when a whole new scoreset gets generated (historically a long, slow
process that took a lot of CPU time).

Sa-updates in 3.1.x will add and remove some rules, change a few scores,
etc, but typically only as needed. 3.2.x expanded that a bit, and we did
a few rescoring runs. 3.3 is taking a newer model where we're
regenerating the scores much more often (thanks in part to the faster
perceptron scoring tool introduced in 3.2).

So if you really want a big shift in rules, new x.y versions are
traditionally where that happens. Sa-updates will tend to push the best
and the brightest new rules, but many of the "good but not great" rules
wait until a version update (where they can vie for space in the ruleset
in a competitive deathmatch with other rules...
:-)

As for the release notes, these bits would readily affect accuracy:

Main New Features section:
(ipv6, dkim and AWL changes may matter if present in your environment)

Rules section:
- new scores were generated by a genetic algorithm (GA) and then
  manually tweaked based on cleaned datasets supplied by a dozen
  volunteers;
- dropped redundant rules or rules causing too many false positives;
- added or updated many rules;

Plugin section:
- new plugins: FreeMail, PhishTag, Reuse;

Bug fixes:
- fixed some cases where :addr headers were parsed incorrectly
- fixed leakage of 'whitelist_from_rcvd' entries between spamd users;
- the 'exists:' evaluator in HEADER rules now works as documented and
  tests for existence of a header field, instead of testing for a
  header field body being nonempty; internally, the pms->get can also
  now distinguish between empty and nonexistent header fields;
- applied fixes to header fields parsing in several places: header
  field names are case-insensitive, whitespace is not required after a
  colon, obsolete rfc822 syntax allowed whitespace before a colon;
  VBounce: match "Received:" only at the beginning of a line;
- fixed parsing of multi-line Received header fields for
  BOUNCE_MESSAGE/VBOUNCE_MESSAGE et al

> Thanks,
> Alex
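
To make the header-related bug fixes above a bit more concrete, here is a
rough Python sketch of the behaviour they describe. It is an illustration
only, not SpamAssassin's code (which is Perl); the names parse_headers,
header_exists and header_nonempty are hypothetical. It assumes the input is
a list of raw header lines, treats field names case-insensitively, accepts
whitespace before the colon and none after it, folds continuation lines,
and keeps "exists" separate from "non-empty":

```python
import re

# Field name (printable ASCII except ':'), optional whitespace before the
# colon (obsolete RFC 822 syntax), the colon, optional whitespace, the body.
_HEADER_RE = re.compile(r'^([!-9;-~]+)[ \t]*:[ \t]*(.*)$')


def parse_headers(lines):
    """Map lowercased field names to a list of field bodies (hypothetical helper)."""
    headers = {}
    current = None
    for line in lines:
        if line[:1] in (" ", "\t") and current is not None:
            # Continuation line: fold it into the previous field body.
            headers[current][-1] += " " + line.strip()
            continue
        m = _HEADER_RE.match(line)
        if m is None:
            # Not a header line; stop folding into the previous field.
            current = None
            continue
        current = m.group(1).lower()
        headers.setdefault(current, []).append(m.group(2).strip())
    return headers


def header_exists(headers, name):
    """True if the field is present at all, even with an empty body."""
    return name.lower() in headers


def header_nonempty(headers, name):
    """True only if at least one instance of the field has a non-empty body."""
    return any(body for body in headers.get(name.lower(), []))
```

With this sketch, a message containing a bare "X-Empty:" line would satisfy
header_exists but not header_nonempty, which is the distinction the fixed
'exists:' evaluator is described as making.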