On 02/09/2011 02:41 PM, Steve Freegard wrote:
On 09/02/11 23:21, Warren Togami Jr. wrote:
Most of these are redundant to rules in my sandbox. Please remove the
redundant parts and I guess put the missing rules into my existing sem
file.
Are you sure this is a bug and not because you forgot to assign a score
to them? IIRC network tests like these are skipped if they do not have a
positive score.
I figured it out. The bug workaround has nothing to do with having a
score or not. You are working around the issue with your use of
#testrules instead of tflags nopublish.
I think I'll therefore leave these in my sandbox for now and see what
happens; the names are different and yours aren't being run anyway so I
can't see any harm.
Consolidate into single sandbox file
====================================
My SEM rules have been in the nightly masscheck since late 2009, and the
URI rules were working before Bug #6527 happened.
RCVD_IN_SEMBLACK is also the official rule name (from SEM's own
website). For network rules like these, the precedent seems to be to
use the official names.
Shall we consolidate the rules into a single file? Let's use your file.
It doesn't matter where they are in the sandbox. I'll delete my file
and let's rename your rules.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6220
Please note anything related to SEM here.
Reuse Changes
=============
"reuse" is only ideal for the FRESH lists. We will have almost no
readings at all if we use "reuse" for SEMBLACK, URI and URIRED.
Is this wise?
=============
Did you ask Blaine if he approves of this?
List operators are required to ask for or approve SpamAssassin testing
of their lists. We haven't been testing his URIBLs for a long time, and
if we suddenly begin testing MULTIPLE of his lists we'll whack his DNS
servers with millions of lookups on Saturday.
The nightly masscheck currently has ~820k mails. Many of those mails
contain multiple domain names. Multiply that by his different URIBLs,
and that is a significant flood of DNS lookups coming out of nowhere.
SURBL's multiple URIBLs avoid this multiplication issue because they
are all answered by a single DNS lookup with different return codes.
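To put rough numbers on the multiplication, here is a minimal Python
sketch. Only the ~820k message count comes from the figures above; the
domains-per-message value, the number of lists, and the bit-to-list
mapping are made-up assumptions for illustration, and caching is
ignored, so treat the results as an upper bound rather than a
measurement. The second function shows the general idea of a combined
zone: one answer whose last octet is a bitmask naming the lists that
matched.

```python
import ipaddress


def estimate_lookups(messages, domains_per_message, zones):
    """Upper bound on DNS queries for one run, ignoring any caching."""
    return messages * domains_per_message * zones


def decode_combined_answer(answer, bit_to_list):
    """Return the list names whose bit is set in the last octet of answer."""
    last_octet = int(ipaddress.IPv4Address(answer)) & 0xFF
    return sorted(name for bit, name in bit_to_list.items() if last_octet & bit)


MESSAGES = 820_000        # approximate corpus size quoted above
DOMAINS_PER_MESSAGE = 2   # assumption, not a measured value
SEPARATE_ZONES = 3        # assumption: one DNS zone per list

# One query per domain per zone when every list has its own zone.
separate = estimate_lookups(MESSAGES, DOMAINS_PER_MESSAGE, SEPARATE_ZONES)
# One query per domain when all lists share a single combined zone.
combined = estimate_lookups(MESSAGES, DOMAINS_PER_MESSAGE, 1)

# Hypothetical bit assignments; a real combined zone documents its own.
HYPOTHETICAL_BITS = {2: "LIST_A", 4: "LIST_B", 8: "LIST_C"}

if __name__ == "__main__":
    print("separate zones:", separate)
    print("combined zone: ", combined)
    print(decode_combined_answer("127.0.0.6", HYPOTHETICAL_BITS))
```

With these assumed inputs the separate-zone case costs three times as
many queries as the combined-zone case, and that factor grows with
every additional list.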
SEMBLACK should be avoided
==========================
http://www.spamtips.org/2011/01/dnsbl-safety-report-1232011.html
http://ruleqa.spamassassin.org/20110205-r1067413-n/T_RCVD_IN_SEMBLACK/detail
I strongly recommend that folks not use RCVD_IN_SEMBLACK because of its
questionable safety record during the past years and its ~90% overlap
with the high-scoring Spamhaus RCVD_IN_PBL. (Also old news: in late
2009 I caught him outright copying PSBL, which he claimed was an
innocent mistake. I don't know what methodology he uses now.)
While its safety improved in recent weeks, this high level of overlap
with PBL makes it dangerous and redundant. Also look at "set 0,
score-map": almost none of the spam hits at 5 points and below are
SEMBLACK. This means using SEMBLACK almost never helps you.
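For anyone who wants to check an overlap figure against their own
corpus, the idea can be sketched in a few lines of Python. This is not
how ruleqa computes its numbers; the message IDs below are toy data
chosen so that the overlap comes out at 90%, and the function names are
only illustrative.

```python
def overlap_fraction(rule_hits, other_hits):
    """Fraction of rule_hits that were also hit by the other rule."""
    if not rule_hits:
        return 0.0
    return len(rule_hits & other_hits) / len(rule_hits)


def unique_contribution(rule_hits, other_hits):
    """Messages that only this rule caught."""
    return rule_hits - other_hits


# Toy message IDs, not real masscheck results.
semblack_hits = set(range(1, 11))            # messages 1..10
pbl_hits = set(range(1, 10)) | {11, 12}      # messages 1..9, 11, 12

if __name__ == "__main__":
    print(overlap_fraction(semblack_hits, pbl_hits))
    print(unique_contribution(semblack_hits, pbl_hits))
```

A rule with high overlap and a small unique set adds little beyond the
rule it overlaps with, which is the redundancy argument made above.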
Warren Togami
[email protected]