Saturday, July 23, 2005, 8:36:58 PM, Duncan wrote:

DF>  * We discussed at length the ideas for the new rules project, and we
DF> came up with some ideas, which we're trying to track at
DF> http://wiki.apache.org/spamassassin/RulesProjectPlan (Please give us
DF> your feedback)

Added to http://wiki.apache.org/spamassassin/RulesProjStreamlining :

> TODO: criteria for overlap with existing rules?
>
> BobMenschel: The method I used for weeding out SARE rules that
> overlapped 3.0.0 rules was to run a full mass-check with overlap
> analysis, and throw away anything where the overlap is less than
> 50%. Manually reviewing the remaining (significantly) overlapping
> rules was fairly easy. The command I use is:
>
>   perl ./overlap ../rules/tested/$testfile.ham.log \
>     ../rules/tested/$testfile.spam.log \
>     | grep -v mid= \
>     | awk ' NR == 1 { print } ; $2 + 0 == 1.000 && $3 + 0 >= 0.500 { print } ' \
>     >../rules/tested/$testfile.overlap.out
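
To show what the grep/awk stage of that pipeline keeps, here is a minimal sketch run against made-up input. The sample lines, the rule names, and the column headings are hypothetical; the only thing taken from the command above is that fields 2 and 3 are numeric ratios. The real layout of the ./overlap output may differ.

```shell
#!/bin/sh
# Same filter as in the command above, wrapped in a function for testing.
# Keeps the first line (assumed to be a header) plus every line whose
# second field equals 1.000 and whose third field is at least 0.500.
filter_overlap() {
    grep -v 'mid=' |
    awk ' NR == 1 { print } ; $2 + 0 == 1.000 && $3 + 0 >= 0.500 { print } '
}

# Hypothetical sample input; NOT the real ./overlap output format.
sample='TYPE RATIO1 RATIO2 RULE_A RULE_B
overlap 1.000 0.750 SARE_FOO CORE_FOO
overlap 1.000 0.200 SARE_BAR CORE_BAR
overlap 0.900 0.800 SARE_BAZ CORE_BAZ
overlap 1.000 0.900 SARE_QUX mid=<x@y>'

printf '%s\n' "$sample" | filter_overlap
```

With this input only the header and the SARE_FOO line survive: SARE_BAR falls below the 0.500 cutoff, SARE_BAZ does not have 1.000 in the second field, and the last line is dropped by the grep because it contains mid=.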


