Re: mass-check + sa-update based reputation system

Warren Togami Jr. Sun, 13 Mar 2011 04:10:21 -0700

On 3/12/2011 2:58 PM, Karsten Bräckelmann wrote:

On Sat, 2011-03-12 at 19:04 -0500, [email protected] wrote:

If we added the last untrusted relay IP to the lines in the mass-check
logs, we could use the data to calculate the percentage of emails from each
IP which is spam vs. ham, and then make SA rules to trigger on varying
percentage ranges.
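
For illustration only, the per-IP calculation proposed above might be sketched roughly as follows. This assumes a hypothetical, simplified input of (relay IP, is_spam) pairs; the actual mass-check log format is not shown in this thread, and the function names and bucket edges are invented for the example.

```python
from collections import defaultdict

def spam_ratio_by_ip(records):
    """Return {ip: fraction of that IP's messages that were spam}.

    `records` is an iterable of (ip, is_spam) pairs. This is a
    hypothetical simplification, not the real mass-check log format.
    """
    counts = defaultdict(lambda: [0, 0])  # ip -> [spam_count, total_count]
    for ip, is_spam in records:
        if is_spam:
            counts[ip][0] += 1
        counts[ip][1] += 1
    return {ip: spam / total for ip, (spam, total) in counts.items()}

def bucket(ratio, edges=(0.5, 0.9, 0.99)):
    """Map a spam ratio to a range index; the edges are illustrative only."""
    return sum(ratio >= edge for edge in edges)

# Documentation-range IPs (RFC 5737), made-up classifications.
records = [
    ("192.0.2.1", True), ("192.0.2.1", True), ("192.0.2.1", False),
    ("198.51.100.7", False), ("198.51.100.7", False),
]
ratios = spam_ratio_by_ip(records)
```

Each range index could then correspond to one rule with its own score. With only a handful of messages per IP the ratios would be very noisy, which is the corpus-size concern raised further down.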


Is this something you're interested in, or would accept a patch for?


A very strong -1.

SA does not block mail, and we do not run a blacklist of our own. (Thus,
we cannot remove a sender from it, which even is a prominent answer on
the wiki.) The fundamental concept of a scoring system, with no rule's
score above the threshold, effectively means that IF a mail ends up as
spam, this never is due to a single rule, listing, whatever -- but
multiple spammy signs summed up.
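
A toy sketch of the scoring concept described above, not SpamAssassin's actual code: the rule names and scores are hypothetical, and only the 5.0 threshold reflects the default required_score setting.

```python
REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score

# Hypothetical rule names and scores, for illustration only.
# Note that no single score reaches the threshold on its own.
RULE_SCORES = {"RULE_A": 2.5, "RULE_B": 1.8, "RULE_C": 1.2}

def is_spam(hits, scores=RULE_SCORES, threshold=REQUIRED_SCORE):
    """Sum the scores of all rules that hit and compare to the threshold."""
    total = sum(scores[name] for name in hits)
    return total >= threshold
```

Here a message hitting only RULE_A stays below the threshold; it takes all three rules together to classify it as spam.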

Running a blacklist of our own would single-handedly ruin the entire
concept of scoring only, and provide a single point of failure we'd be
responsible for.


Moreover, I'd guess the corpus is NOT sufficiently large, by far -- even
more so with very recent data necessary for an accurate measurement and
re-gaining a good reputation over time.

Someone actually running an IP reputation service probably can chime in
with better numbers, but just as a figure -- MailSpike, who already
offered "some" of their spam for our mass-check (see recent thread), has
more trap data per day than our entire corpus.


Not even to mention that this requires a LOT of fine-tuning and special
knowledge...


Darxus,

I very much appreciate your thinking about possible ways of improving
spamassassin, but in this particular case I completely agree with
Karsten. Please don't feel discouraged from thinking about and
proposing possibly crazy new ideas. I simply accept that most ideas are
bad. Every once in a while one of those ideas will stick. (Fedora
began as a crazy idea proposed on a mailing list.) I particularly want
to hear more ideas of automated data collection/use.

I think a better low-hanging fruit to make something useful with our
traps might be to submit the data in an automated fashion to DNSWL
and/or an IP reputation system to help with their automated
classification/enforcement. Your script that lets me auto-submit spam
that violates DNSWL has been incredibly helpful. A similar script to
submit to Mailspike might be similarly helpful.

I'd like to follow up with you and Mailspike on this and <that other
issue>, but I won't have time until mid-April.


Warren
