On 3/12/2011 2:58 PM, Karsten Bräckelmann wrote:
On Sat, 2011-03-12 at 19:04 -0500, [email protected] wrote:
If we added the last untrusted relay IP to the lines in the mass-check
logs, we could use the data to calculate the percentage of emails from each
IP which is spam vs. ham, and then make SA rules to trigger on varying
percentage ranges.
Is this something you're interested in, or would accept a patch for?
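
As a rough illustration of the calculation proposed above, here is a minimal Python sketch. The input format is an assumption: each record is a pair of (last untrusted relay IP, spam flag), which is not the real mass-check log format. The function names, the min_messages cutoff, and the range edges are hypothetical as well.

```python
from bisect import bisect_right
from collections import defaultdict


def spam_ratio_by_ip(records, min_messages=1):
    """Return {ip: fraction of messages from that IP that were spam}.

    `records` is an iterable of (ip, is_spam) pairs. This is an assumed,
    simplified input, not the actual mass-check log format.
    """
    counts = defaultdict(lambda: [0, 0])  # ip -> [spam count, ham count]
    for ip, is_spam in records:
        counts[ip][0 if is_spam else 1] += 1

    ratios = {}
    for ip, (spam, ham) in counts.items():
        total = spam + ham
        # Skip IPs with too few messages to say anything meaningful.
        if total >= min_messages:
            ratios[ip] = spam / total
    return ratios


def percentage_range(ratio, edges=(0.25, 0.5, 0.75)):
    """Map a spam ratio to the index of its range (0 = mostly ham).

    The edges are arbitrary illustrative values, one rule per range.
    """
    return bisect_right(edges, ratio)


# Example with documentation-only IP addresses.
records = [
    ("192.0.2.1", True),
    ("192.0.2.1", True),
    ("192.0.2.1", False),
    ("198.51.100.7", False),
]
ratios = spam_ratio_by_ip(records)
```

With only a handful of messages per IP the ratios are very noisy, which is why the sketch takes a min_messages cutoff.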
A very strong -1.
SA does not block mail, and we do not run a blacklist of our own. (Thus,
we cannot remove a sender from it, which even is a prominent answer on
the wiki.) The fundamental concept of a scoring system, with no rule's
score above the threshold, effectively means that IF a mail ends up as
spam, this never is due to a single rule, listing, whatever -- but
multiple spammy signs summed up.
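
To make the scoring concept concrete, a minimal Python sketch follows. The rule names and scores are made up for illustration and are not real SpamAssassin rules; the threshold of 5.0 matches the default required_score, and the classify function is a hypothetical stand-in for the real scoring engine.

```python
THRESHOLD = 5.0  # SpamAssassin's default required_score

# Hypothetical rule scores. No single score reaches the threshold.
SCORES = {
    "RULE_A": 2.5,
    "RULE_B": 1.8,
    "RULE_C": 1.2,
}


def classify(hits, scores=SCORES, threshold=THRESHOLD):
    """Sum the scores of the rules that hit and compare to the threshold.

    Returns (total score, True if the message would be tagged as spam).
    """
    total = sum(scores[name] for name in hits)
    return total, total >= threshold


# One rule alone never tags a message as spam...
single_total, single_is_spam = classify(["RULE_A"])

# ...but several spammy signs together can.
combined_total, combined_is_spam = classify(["RULE_A", "RULE_B", "RULE_C"])
```

A rule based on a self-run IP list with a score at or above the threshold would break this property, which is the objection raised here.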
Running a blacklist of our own would single-handedly ruin the entire
concept of scoring only, and provide a single point of failure we'd be
responsible for.
Moreover, I'd guess the corpus is NOT sufficiently large, by far -- even
more so since very recent data is necessary for an accurate measurement
and for letting a sender regain a good reputation over time.
Someone actually running an IP reputation service probably can chime in
with better numbers, but just as a figure -- MailSpike, who already
offered "some" of their spam for our mass-check (see recent thread), has
more trap data per day than our entire corpus.
Not even to mention that this requires a LOT of fine-tuning and special
knowledge...
Darxus,
I very much appreciate your thinking about possible ways of improving
spamassassin, but in this particular case I completely agree with
Karsten. Please don't feel discouraged from thinking about and
proposing possibly crazy new ideas. I simply accept that most ideas are
bad. Every once in a while one of those ideas will stick. (Fedora
began as a crazy idea proposed on a mailing list.) I particularly want
to hear more ideas of automated data collection/use.
I think a better low-hanging fruit for making something useful with our
traps might be to submit the trap data in an automated fashion to DNSWL
and/or an IP reputation system, to help with their automated
classification/enforcement. Your script to allow my auto-submission of
spam that violates DNSWL has been incredibly helpful. A similar script
to submit to Mailspike might be similarly helpful.
I'd like to follow up with you and Mailspike on this and <that
other issue>, but I won't have time until mid-April.
Warren