On Thu, 22 May 2014, Kai Meyer wrote:

I have a CentOS 6 postfix + dovecot + mysql (for vmail) + spamassassin (user prefs via mysql) server that I've been running for a few years now. It's just a few of my private domains, not a lot of traffic. In the last 6 months, the amount of spam getting through has gone from one or two a week to 30 a day.

I had sa-learn set up on IMAP folders called SPAM and HAM, running as root, so I just started tossing emails in there. It seemed like I had groups of emails scoring around 2, 0, -1, and -2 (my threshold to dump to my JUNK folder is 3, and I have spamchk sideline things above 7). I still get legitimate email in the 2-3 range, but I haven't had legitimate email above 3 in a long time. After a bit, the 2s became 3s and the 0s became 1s, but the -1 and -2 spam emails stayed put. I did this habitually for more than a month, and the progress seemed to stop.

I googled around a bit and realized that I hadn't done a very good job setting up rules, so I added pyzor and razor2, and they seem functional. Spam got better, and it's down to maybe 10 a day, but the ones that get through still score anywhere up to 5.

What really gets me is that if I take an email that scores -2, strip the X-Spam* headers, and run it through spamc by hand (even as the spamd user) just like the spamchk script does, it scores around a 4. I have one here that scores a 4.1 if it comes through the mail, and a 6.6 if I run it manually. What can I do to reconcile these scores? I would like the scores I'm getting from the commandline over the ones I'm getting through postfix, but I don't know the system well enough to know what is causing the difference.

================== Via postfix
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on kai2.gnukai.com
X-Spam-Flag: YES
X-Spam-Level: ****
X-Spam-Status: Yes, score=4.1 required=3.0 tests=BAYES_60,HTML_IMAGE_RATIO_08,
       HTML_MESSAGE,INVALID_DATE,MIME_HTML_ONLY,RDNS_NONE,SPF_PASS autolearn=no
       version=3.3.1
...
Content analysis details:   (4.1 points, 3.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.1 INVALID_DATE           Invalid Date: header (not RFC 2822)
-0.0 SPF_PASS               SPF: sender matches SPF record
 0.0 HTML_IMAGE_RATIO_08    BODY: HTML has a low ratio of text to image area
 1.5 BAYES_60               BODY: Bayes spam probability is 60 to 80%
                            [score: 0.6298]
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.7 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 0.8 RDNS_NONE              Delivered to internal network by a host with no rDNS


================ Via commandline (cat test.mail | sudo -u spamd /usr/bin/spamc -u <myemail> > postsa.mail)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on kai2.gnukai.com
X-Spam-Flag: YES
X-Spam-Level: ******
X-Spam-Status: Yes, score=6.6 required=3.0 tests=BAYES_60,HTML_MESSAGE,
       INVALID_DATE,MIME_HTML_ONLY,RDNS_NONE,SPF_PASS,URIBL_DBL_SPAM autolearn=no
       version=3.3.1
...
Content analysis details:   (6.6 points, 3.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.1 INVALID_DATE           Invalid Date: header (not RFC 2822)
-0.0 SPF_PASS               SPF: sender matches SPF record
 2.5 URIBL_DBL_SPAM         Contains an URL listed in the DBL blocklist
                            [URIs: fellage.me]
 1.5 BAYES_60               BODY: Bayes spam probability is 60 to 80%
                            [score: 0.6299]
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.7 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 0.8 RDNS_NONE              Delivered to internal network by a host with no rDNS
[snip..]

The only significant difference between those two score sets is the
URIBL_DBL_SPAM hit in the second one, which accounts for the extra
2.5 points (4.1 + 2.5 = 6.6). Most likely, by the time you got around
to running that manual check, somebody had reported that URL to the
DBL and it had been cataloged as a spammer URL.

If you had run that manual check at the same time as the postfix run
(or soon thereafter), it probably wouldn't have had that URIBL_DBL_SPAM
hit and would thus have produced the same score.
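
To compare two runs like this without eyeballing the reports, you can diff
the rule lists from the two X-Spam-Status headers. The following is only a
rough sketch: parse_tests is a hypothetical helper name, and the header
strings are pasted in from the two reports above.

```python
import re

def parse_tests(status_header):
    """Return the set of rule names listed in an X-Spam-Status header value."""
    # Headers may be folded across lines; collapse all whitespace first.
    unfolded = " ".join(status_header.split())
    # The rule list sits between "tests=" and the next key (autolearn/version).
    match = re.search(r"tests=(.*?)(?:\s+autolearn=|\s+version=|$)", unfolded)
    if match is None:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}

# Header values copied from the two reports quoted above.
VIA_POSTFIX = """Yes, score=4.1 required=3.0 tests=BAYES_60,HTML_IMAGE_RATIO_08,
 HTML_MESSAGE,INVALID_DATE,MIME_HTML_ONLY,RDNS_NONE,SPF_PASS autolearn=no
 version=3.3.1"""
VIA_SPAMC = """Yes, score=6.6 required=3.0 tests=BAYES_60,HTML_MESSAGE,
 INVALID_DATE,MIME_HTML_ONLY,RDNS_NONE,SPF_PASS,URIBL_DBL_SPAM autolearn=no
 version=3.3.1"""

postfix_rules = parse_tests(VIA_POSTFIX)
spamc_rules = parse_tests(VIA_SPAMC)

# Rules that fired in one run but not in the other.
only_spamc = sorted(spamc_rules - postfix_rules)
only_postfix = sorted(postfix_rules - spamc_rules)
print("only in manual run: ", only_spamc)
print("only in postfix run:", only_postfix)
```

For these two headers the manual run adds URIBL_DBL_SPAM (2.5 points) and
the postfix run adds HTML_IMAGE_RATIO_08 (0.0 points), which matches the
4.1 versus 6.6 totals.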

In that regard, URIBLs are like anti-virus signatures: they don't do much
good against a zero-day attack, but they do catch repeat offenders.
Spammers know that and register tens of thousands (or more) of new domain
names each day, use them for a few days, and then discard them.
That is good news if you're a registrar (lots of fresh business) and bad
news if you run a TLD DNS server (those zones already hold many millions
of names) or are in the anti-spam business.
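
If you want to see whether a domain has been listed yet, you can query
the list directly over DNS. The sketch below assumes that URIBL_DBL_SPAM
is backed by the Spamhaus DBL zone (dbl.spamhaus.org) and that an answer
in 127.0.1.x means "listed"; check the list's own documentation for the
exact return codes and usage policy. The function names are made up for
illustration.

```python
import socket

# Assumption: the zone queried for URIBL_DBL_SPAM is the Spamhaus DBL.
DBL_ZONE = "dbl.spamhaus.org"

def dbl_query_name(domain):
    """Build the DNS name to look up for a given domain."""
    return f"{domain.strip().rstrip('.').lower()}.{DBL_ZONE}"

def is_listed_answer(address):
    """Assumption: 127.0.1.x answers are listings; other 127.x are errors."""
    return address.startswith("127.0.1.")

def check_domain(domain):
    """Do the lookup (needs network access). Returns True if listed.

    Any lookup failure, including NXDOMAIN, is treated as "not listed",
    so a broken resolver will look the same as a clean domain.
    """
    try:
        answer = socket.gethostbyname(dbl_query_name(domain))
    except OSError:
        return False
    return is_listed_answer(answer)
```

Running check_domain("fellage.me") right after the spam arrives and again
a few hours later would show whether the listing appeared in between. Note
that queries sent through large public resolvers may be refused by the
list operator.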

The one thing that might help is greylisting in your MTA: delaying mail
from unknown senders may give the URL enough time to become listed in a
URIBL, so that the message is recognized as spam when it is retried.
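
For anyone unfamiliar with greylisting, the idea can be sketched in a few
lines. This is a toy model only, not how any particular Postfix policy
service is implemented; the Greylist class, the 300-second delay, and the
in-memory dictionary are all assumptions chosen for illustration.

```python
class Greylist:
    """Defer mail from an unseen (client, sender, recipient) triplet."""

    def __init__(self, delay_seconds=300):
        # Assumed minimum wait before a retry is accepted.
        self.delay_seconds = delay_seconds
        # Maps triplet -> timestamp of the first delivery attempt.
        self.first_seen = {}

    def check(self, client_ip, sender, recipient, now):
        """Return "defer" (temporary reject) or "accept" for this attempt."""
        key = (client_ip, sender.lower(), recipient.lower())
        # Record the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay_seconds:
            return "accept"
        return "defer"


greylist = Greylist(delay_seconds=300)
triplet = ("192.0.2.1", "sender@example.com", "rcpt@example.org")
print(greylist.check(*triplet, now=0))    # first attempt is deferred
print(greylist.check(*triplet, now=600))  # retry after the delay is accepted
```

A legitimate MTA retries after the temporary failure and gets through,
and the scan then happens at retry time, when the URIBL has had longer to
catch up.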

Tough, but that's the name of the game these days.


--
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
