https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7021
--- Comment #4 from Ivo Truxa <[email protected]> ---
(In reply to Kevin A. McGrail from comment #1)
> Add this to cron:
>
> DELETE FROM awl WHERE lastupdate <= (now() - INTERVAL 15 day) and count < 5;
> DELETE FROM awl WHERE lastupdate <= (now() - INTERVAL 30 day) and count < 10;
> DELETE FROM awl WHERE lastupdate <= (now() - INTERVAL 60 day) and count < 20;
> DELETE FROM awl WHERE lastupdate <= (now() - INTERVAL 120 day);

BTW, not that I am against expiry as such, but I wonder what the motivation is for expiring AWL records as aggressively as shown above. If someone removes all senders with fewer than 5 messages once they have gone 15 days without an update, and removes absolutely everything not updated for 4 months, then I wonder whether the AWL system can be of any help to him at all. In fact, the system then keeps only rather recent records of senders who mail almost daily. That means either regular spammers or hammers, who would probably be better white/blacklisted in a more consistent way (manually, with tuned rules, or with Bayes).

I may be wrong, but for me the ability of AWL to prevent an occasional good sender from hitting a false positive is more important than handling regular senders, for whom Bayes, rules, and white/blacklists are certainly already tuned to handle them correctly. So by continuously dumping practically the entire AWL database, you lose a significant part of the functionality AWL provides.

Perhaps on servers handling millions of emails monthly the size of the database and its performance become an issue, but unless that is really the case, I would rather advise keeping the records as long as possible. After all, customers or friends mailing back after a few years are nothing exceptional, and it is especially they who can fall victim to false positives (rules change over time), and for whom the recorded score would help.

I can see a possible reason for expiring all records after 120 days, perhaps in trying to make AWL adapt better to continuously changing rules.
This is unneeded with TxRep, because it works differently. First of all, it has the ability to learn messages, and auto-learning is also available, so clear spam/ham can be learned or relearned any time the rules change. In addition, new messages are always stored with a higher weight than old ones, meaning that the influence of old messages vanishes over time automatically, without the need to delete any records.
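To illustrate how weighting new messages more heavily lets old ones fade without deleting anything, here is a minimal sketch using a plain exponentially weighted average. This is not TxRep's actual formula; the function name `fold_scores` and the parameter `alpha` are hypothetical and chosen only for this example.

```python
def fold_scores(scores, alpha=0.25):
    """Fold message scores (oldest first) into a single reputation value.

    Assumption for illustration only: each new score gets weight `alpha`
    and the stored reputation keeps weight (1 - alpha). This is a generic
    exponentially weighted average, not TxRep's real computation.
    """
    reputation = scores[0]
    for score in scores[1:]:
        reputation = (1 - alpha) * reputation + alpha * score
    return reputation


# One high score followed by ten neutral messages: the old score has faded.
old_high = fold_scores([10.0] + [0.0] * 10)

# Ten neutral messages followed by one high score: the new score dominates.
new_high = fold_scores([0.0] * 10 + [10.0])

print(old_high < new_high)
```

Both sequences contain the same scores; only the order differs, and the older a score is, the less it contributes.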
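For reference, the four cron statements quoted from comment #1 can be restated as a small predicate that tells which AWL records survive a run. This is only a sketch of the quoted policy; the names `RULES` and `survives` are made up for this example, and `lastupdate` and `count` mirror the column names in the quoted SQL.

```python
from datetime import datetime, timedelta

# (age_days, min_count): a record is deleted when it was last updated at
# least age_days ago and its count is below min_count.
# min_count=None means the rule deletes regardless of count.
RULES = [(15, 5), (30, 10), (60, 20), (120, None)]


def survives(lastupdate, count, now):
    """Return True if a record would be kept by all four DELETE statements."""
    age = now - lastupdate
    for age_days, min_count in RULES:
        old_enough = age >= timedelta(days=age_days)
        if old_enough and (min_count is None or count < min_count):
            return False
    return True


now = datetime(2014, 3, 1)
print(survives(now - timedelta(days=20), 3, now))     # 3 messages, idle 20 days
print(survives(now - timedelta(days=90), 25, now))    # 25 messages, idle 90 days
print(survives(now - timedelta(days=400), 500, now))  # 500 messages, idle 400 days
```

Under this policy an occasional correspondent is dropped within weeks, and even a very frequent one is dropped after 120 days of silence.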
