http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5722
Summary: hit-frequencies needs new ranking algorithm which likes
fresh spam
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Platform: Other
OS/Version: other
Status: NEW
Severity: minor
Priority: P5
Component: Masses
AssignedTo: [email protected]
ReportedBy: [EMAIL PROTECTED]
Rules which hit "fresh" spam currently fare badly in the hit-frequencies ranking
report.
Imagine a mass-check which contains 50k spam messages. 46k are from between 3
months and 1 week old, and the remaining 4k are fresher than 1 week old. A rule
that hits 10% of that "fresh" spam, therefore, hits only 0.8% of the overall
corpus -- which doesn't look so impressive compared to other rules. But because
it's hitting "fresh" spam, that's very useful for us.
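The arithmetic in that example can be checked with a few lines of Python. This
is only an illustration; the function and parameter names are made up here and
are not part of hit-frequencies.

```python
def overall_hit_pct(total_spam, fresh_spam, fresh_hit_rate):
    """Percentage of the whole spam corpus hit by a rule that only
    fires on a fraction (fresh_hit_rate) of the fresh messages."""
    hits = fresh_spam * fresh_hit_rate
    return 100.0 * hits / total_spam


if __name__ == "__main__":
    # Figures from the example: 50k spam in total, 4k of it under a week old,
    # and a rule that hits 10% of the fresh portion.
    pct = overall_hit_pct(total_spam=50_000, fresh_spam=4_000, fresh_hit_rate=0.10)
    print(f"{pct:.1f}% of the overall spam corpus")
```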
We should try to come up with a new ranking algorithm which can take this into
account -- possibly by biasing against "old" spam, so that a hit on spam counts
for less the older the message is.
It needn't bias against "old" ham, however, since ham doesn't have this issue.
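One possible shape for such a weighting, as a rough sketch only. The bug does
not specify a decay function; the exponential decay, the 7-day half-life, and
all names below (Message, spam_weight, weighted_rates) are assumptions made for
illustration, not existing hit-frequencies code. Spam hits are weighted by age,
while ham is counted unweighted.

```python
from dataclasses import dataclass


@dataclass
class Message:
    age_days: float  # age of the message at mass-check time
    is_spam: bool
    hit: bool        # whether the rule under test fired on it


def spam_weight(age_days, half_life_days=7.0):
    """Assumed decay: a spam message loses half its weight every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def weighted_rates(messages, half_life_days=7.0):
    """Return (age-weighted spam hit rate, plain ham hit rate) for one rule."""
    spam_hit = spam_total = 0.0
    ham_hit = ham_total = 0
    for m in messages:
        if m.is_spam:
            # Old spam contributes less to both numerator and denominator.
            w = spam_weight(m.age_days, half_life_days)
            spam_total += w
            if m.hit:
                spam_hit += w
        else:
            # Ham is not biased by age.
            ham_total += 1
            if m.hit:
                ham_hit += 1
    spam_rate = spam_hit / spam_total if spam_total else 0.0
    ham_rate = ham_hit / ham_total if ham_total else 0.0
    return spam_rate, ham_rate
```

With the assumed 7-day half-life, a hit on week-old spam counts half as much as
a hit on spam received today, so a rule that fires mostly on fresh spam ranks
higher than its raw hit percentage would suggest.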