https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7797
--- Comment #3 from Bill Cole <[email protected]> ---

TL;DR Summary: -1. I'm on the verge of closing this as INVALID or WONTFIX, but want feedback from other SA contributors.

The original Debian bug references "spamass-milter", which is not part of the SpamAssassin project and never has been. The fix it suggests is a change in /etc/default/spamass-milter, which is Debian-specific.

The reasoning behind the 500KB default limit is not just about basic resource limits like memory or disk space. It is also about the nature of spam and how SA operates. While it has long been true that the majority of all messages are spam, the overwhelming majority of messages over 100KB are not spam, and spam messages over 500KB are both relatively uncommon and hard for SA to identify even if the limit is raised.

Very large spams are less rare than in the past, but they are largely unlike most smaller spam: they are mostly "phishing" messages carrying malware attachments, carefully designed to look like high-value types of mail such as messages with attached business invoices or legal documents. Because SA is neither designed nor intended to be a general malware filter, it is not good at distinguishing between a megabyte of innocent binary data and a megabyte of malicious code. There has also been very little development of SA rules to catch any of the other types of spam that make up a substantial fraction of the sparse realm of big spam.

Beyond the problem that there is little spam to catch above 500KB and that SA is bad at catching the big spam which does exist, SA's rule system carries the risk of "catastrophic backtracking", a problem inherent in backtracking regular expression engines. We believe that we have eliminated risky rules from the default ruleset, but users can still create risky rules themselves, and limiting the scan size provides some protection.
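To illustrate the backtracking point, here is a toy sketch in Python rather than anything taken from SA itself. The pattern, the `scan` helper, and the 64-character limit are all made up for illustration. SA rules are Perl regular expressions and the real limit is measured in bytes, but the failure mode with nested quantifiers is the same in Python's backtracking `re` engine.

```python
import re
import time
from typing import Optional

# Hypothetical user-written rule with nested quantifiers (not from any SA ruleset).
# On a run of "a" followed by a non-matching character, a backtracking engine
# tries exponentially many ways to split the run before giving up.
RISKY_RULE = re.compile(r"^(a+)+$")


def scan(body: str, max_size: int) -> Optional[bool]:
    """Apply the rule to body; return None if the body is over the size limit.

    This mimics "messages above the limit are passed through unscanned".
    The limit here counts characters, purely to keep the toy simple.
    """
    if len(body) > max_size:
        return None  # too large: skipped, so the risky rule never runs
    return RISKY_RULE.search(body) is not None


if __name__ == "__main__":
    # Each two extra characters roughly quadruples the time of a failing match.
    for n in (12, 14, 16, 18):
        body = "a" * n + "b"
        start = time.perf_counter()
        result = scan(body, max_size=64)
        elapsed = time.perf_counter() - start
        print(f"n={n:2d} result={result} elapsed={elapsed:.4f}s")

    # A much longer body would effectively never finish, but the limit skips it.
    print(scan("a" * 200 + "b", max_size=64))
```

The point of the sketch is only that a size cap bounds the worst case a careless rule can hit. It does not make the rule itself safe.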
Because this is a limit that users (or distributions) are entirely free to adjust to their own tastes and because the rationales for it ARE NOT obsoleted by technological advancement, I do not believe that we should change the default limit.
