SARE's HTML rule sets have been updated. All rules have been renamed and redescribed as necessary to prevent --lint warnings under version 3.0.0.
A rule that duplicates a rule added to the 3.0.0 distribution rule set has been moved to a separate file. Rules that work well for English-based systems, but that might cause false positives on systems handling other languages, have been moved to a separate file. Rules that used to work well but no longer hit spam have been archived. HTML rules from ratware.cf have been merged into this collection.

There are several ruleset files in this collection (nine, counting the archive):

70_sare_html0.cf contains those HTML rules which in all SARE mass-check testing hit ONLY spam. This is the safest of the four HTML rulesets for use.

70_sare_html1.cf, unlike 70_sare_html0.cf, contains rules which do (or in the past did) hit ham during SARE mass-check tests. The S/O values calculated by SA's hit-frequencies scripts are all at or above 0.900. This file also contains rules which hit only spam, but fewer than 10 spam in our mass-check tests. Systems which are highly sensitive to false positives and/or tight on resources may want to exclude this ruleset, pick and choose among its rules, or lower their scores. Any system using file 1 should also use file 0.

70_sare_html2.cf contains only rules which should never hit ham, but which do not currently hit any emails during SARE mass-check testing against current corpora. Systems which are very sensitive to SpamAssassin overhead may therefore want to exclude this ruleset to avoid the cost of its regular expressions.

70_sare_html3.cf contains a subset of HTML rules which hit a significant amount of ham during SARE mass-check tests. Systems which are very sensitive to false positives should probably NOT install this ruleset.

70_sare_html4.cf contains a subset of HTML rules which hit over 100 ham during SARE mass-check tests. Systems which are very sensitive to false positives should probably NOT install this ruleset.
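
For readers unfamiliar with the S/O figure quoted for 70_sare_html1.cf, here is a minimal Python sketch of how such a ratio can be computed. It assumes S/O is the spam hit rate divided by the sum of the spam and ham hit rates, each taken as a fraction of its own corpus. The function name so_ratio and its parameters are hypothetical and are not part of SpamAssassin or of SARE's tools.

```python
def so_ratio(spam_hits, spam_total, ham_hits, ham_total):
    """Return an S/O value for one rule, or None if the rule hit nothing.

    Assumption: S/O = spam hit rate / (spam hit rate + ham hit rate),
    with each rate normalized by the size of its own corpus.
    """
    if spam_total <= 0 or ham_total <= 0:
        raise ValueError("corpus sizes must be positive")
    spam_rate = spam_hits / spam_total
    ham_rate = ham_hits / ham_total
    if spam_rate + ham_rate == 0:
        return None  # the rule matched no mail at all
    return spam_rate / (spam_rate + ham_rate)


if __name__ == "__main__":
    # Hypothetical counts: 90 of 1000 spam and 10 of 1000 ham were hit.
    ratio = so_ratio(90, 1000, 10, 1000)
    print(f"S/O = {ratio:.3f}")
```

Under this assumption, a rule that hits only spam has an S/O of 1.0, and a value at or above 0.900 means that at least nine tenths of the rule's normalized hit rate comes from spam.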

70_sare_html_eng.cf contains HTML rules which work well where English is the only expected language, but that may cause false positives in systems which receive a significant number of emails in other languages.

70_sare_html_x30.cf contains HTML rules which have been merged into the SpamAssassin distribution rule set in version 3.0.0. Systems which have upgraded to this version should not use this file. Systems running any 2.5x or 2.6x version should benefit from these rules.

70_sare_html_arc.cf contains HTML rules which used to work, but which no longer hit spam, or that hit too many ham. SARE will continue testing these rules and will reactivate any that again become productive. This rules file should be used only by the most aggressive systems with available resources.

As announced two months ago, the old coding_html.cf rule set file has now been deleted.

Bob Menschel
