On Sat, 2011-03-12 at 19:04 -0500, [email protected] wrote:
> If we added the last untrusted relay IP to the lines in the mass-check
> logs, we could use the data to calculate the percentage of emails from each
> IP which is spam vs. ham, and then make SA rules to trigger on varying
> percentage ranges.
>
> Is this something you're interested in, or would accept a patch for?
A very strong -1.
SA does not block mail, and we do not run a blacklist of our own. (Thus,
we cannot remove a sender from it, which is even a prominent answer on
the wiki.) The fundamental concept of a scoring system, with no rule's
score above the threshold, effectively means that IF a mail ends up
classified as spam, this is never due to a single rule, listing, or
whatever -- but to multiple spammy signs summed up.
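To illustrate the additive scoring idea, here is a minimal sketch. The rule
names and scores are made up for the example and are not real SA rules; the
threshold of 5.0 matches SA's default required score, but treat the whole
thing as an assumption-laden toy rather than how SA is implemented.

```python
# Toy model of additive scoring. Rule names and scores are hypothetical.
THRESHOLD = 5.0  # assumed threshold for this sketch

RULE_SCORES = {
    "LISTED_ON_SOME_DNSBL": 2.5,   # hypothetical single blacklist hit
    "SUSPICIOUS_SUBJECT": 1.8,     # hypothetical content rule
    "FORGED_HEADER": 2.2,          # hypothetical header rule
    "KNOWN_GOOD_SENDER": -1.5,     # hypothetical ham indicator
}


def total_score(hits):
    """Sum the scores of all rules that hit on a message."""
    return sum(RULE_SCORES[name] for name in hits)


def is_spam(hits):
    """A message counts as spam only if the summed score reaches the threshold."""
    return total_score(hits) >= THRESHOLD


# By construction, no single rule reaches the threshold on its own.
assert all(score < THRESHOLD for score in RULE_SCORES.values())
```

With these made-up numbers, a lone blacklist hit stays below the threshold,
while three independent spammy signs together cross it.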
Running a blacklist of our own would single-handedly ruin the entire
concept of scoring only, and provide a single point of failure we'd be
responsible for.
Moreover, I'd guess the corpus is NOT sufficiently large, by far -- even
more so given that only very recent data is useful for an accurate
measurement, and for letting an IP regain a good reputation over time.
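As a rough illustration of the sample-size concern, the sketch below
computes a Wilson score interval for a per-IP spam ratio. The choice of the
Wilson interval and the function name are my own assumptions for the
example; nothing like this exists in mass-check.

```python
import math


def wilson_interval(spam, total, z=1.96):
    """Approximate 95% Wilson score interval for the spam fraction of one IP.

    spam  -- number of spam messages seen from the IP
    total -- number of all messages seen from the IP
    z     -- normal quantile (1.96 is roughly a 95% interval)
    """
    if total == 0:
        # No data at all: the ratio could be anything.
        return (0.0, 1.0)
    p = spam / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


# Three spams out of three mails looks like "100% spam", but the interval
# is very wide. With thousands of samples it becomes narrow.
few = wilson_interval(3, 3)
many = wilson_interval(3000, 3000)
```

An IP seen only a handful of times yields an interval too wide to justify
any percentage-range rule, which is why the per-IP sample count matters far
more than the overall corpus size.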
Someone actually running an IP reputation service can probably chime in
with better numbers, but just to give a figure -- MailSpike, who already
offered "some" of their spam for our mass-check (see recent thread), has
more trap data per day than our entire corpus.
Not to mention that this requires a LOT of fine-tuning and special
knowledge...
--
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}