Re: spamd and network whitelisting

2017-01-10 Thread Christopher Zimmermann
On 2016-12-16 Clint Pachl  wrote:

[...]
> What would be
> best is if we could blacklist these spammers upon first connection

I also wanted to make just-in-time decisions, but with dnswl lookups.
I wrote a program to intercept incoming, unknown smtp connections and
do a dnswl lookup to whitelist them just in time. You could do the same
for blacklisting, but only for lookups based on ip because the program
looks only at the initial syn packet.
For me this helped a lot to deliver mails faster which would otherwise
be delayed in the greytrap, or even get stuck, because they come from
smtp pools.
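
As a rough illustration of the IP-only lookup described above, here is a minimal sh sketch (not part of the C program below) that builds the DNS name a whitelist lookup would query. The function name dnswl_query_name is made up for this example, and it assumes IPv4 only.

```sh
#!/bin/sh
# Build the DNS name used to look up an IPv4 address in a DNS whitelist:
# the octets are reversed and the list's zone is appended.
# dnswl_query_name is a hypothetical helper name, not from the C program.
dnswl_query_name() {
    ip=$1
    zone=$2
    old_ifs=$IFS
    IFS=.
    set -- $ip              # split the address into its octets
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1    # not a dotted quad
    echo "$4.$3.$2.$1.$zone"
}

# Possible use (needs network access, so not run here):
#   host -t A "$(dnswl_query_name 192.0.2.10 list.dnswl.org)"
```

By the usual DNS list convention, an answer in 127.0.0.0/8 means the address is listed and NXDOMAIN means it is not. The same name construction works for a DNSBL zone.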


here are the pf rules:
pass in on egress inet proto tcp to (self) port smtp flags S/SA no state
divert-packet port 25
pass in on egress inet proto tcp from  to (self) port smtp keep
state rdr-to 127.0.0.1 port spamd
pass in log (to pflog1) on egress proto tcp from { }
to port smtp keep state

And here's the C program. It still has lots of dead debugging code:

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 


#define DEBUG 0

#define DIVERT_PORT 25

#define NSTATES 10

struct dns_header {
uint16_t id;
uint16_t flags;
#define QR 0x8000
#define OPCODE_MASK 0x7800
#define OPCODE_SHIFT 11
#define AA 0x0400
#define TC 0x0200
#define RD 0x0100
#define RA 0x0080
#define AD 0x0020
#define CD 0x0010
#define RCODE_MASK 0x000f
#define RCODE_SHIFT 0
uint16_t qdcount;
uint16_t ancount;
uint16_t nscount;
uint16_t arcount;
};

struct dns_record {
uint16_t type;
uint16_t class;
uint32_t ttl;
uint16_t length;
};

struct state {
union {
struct in_addr in4;
struct in6_addr in6;
uint8_t octets[sizeof(struct in6_addr)];
} addr;
struct timespec timeout;
int af;
uint16_t dnskey;
} states[NSTATES];

void send_query(struct state *state, const char *question);
void process_response();

void enlist(struct state *state, int white);

int dnssock, pfdev;

const char *const whitelists[] = {
"list.dnswl.org",
"swl.spamhaus.org",
};

int main(int argc, char *argv[])
{
int i, ret;
time_t t;
struct sockaddr_in sin4;
struct sockaddr_in6 sin6;
struct group *group;
struct passwd *passwd;
struct pollfd fds[3];

tzset();

pfdev = open("/dev/pf", O_RDWR);
if (pfdev == -1) err(1, "open(\"/dev/pf\") failed");

ret = IPPROTO_DIVERT_INIT;
setsockopt(fds[1].fd, IPPROTO_IP, IP_DIVERTFL, &ret, sizeof(ret));
setsockopt(fds[2].fd, IPPROTO_IPV6, IP_DIVERTFL, &ret, sizeof(ret));

/* DNS */
if (res_init() == -1) err(1, "res_init");
assert(_res_ext.nsaddr_list[0].ss_family != 0);
fds[0].fd = dnssock = socket(_res_ext.nsaddr_list[0].ss_family,
   SOCK_DGRAM | SOCK_DNS, 0);
if (fds[0].fd == -1) err(1, "socket");

if (connect(fds[0].fd, (struct sockaddr *)&_res_ext.nsaddr_list[0],
_res_ext.nsaddr_list[0].ss_len) != 0)
err(1, "connect");

/* IPv4 divert */
memset(&sin4, 0, sizeof(sin4));
sin4.sin_family = AF_INET;
sin4.sin_port = htons(DIVERT_PORT);
sin4.sin_addr.s_addr = INADDR_ANY;
fds[1].fd = socket(AF_INET, SOCK_RAW, IPPROTO_DIVERT);
if (fds[1].fd == -1) err(1, "socket");
if (bind(fds[1].fd, (struct sockaddr *) &sin4, sizeof(sin4)) != 0)
err(1, "bind");

/* IPv6 divert */
memset(&sin6, 0, sizeof(sin6));
sin6.sin6_family = AF_INET6;
sin6.sin6_port = htons(DIVERT_PORT);
sin6.sin6_addr = in6addr_any;
fds[2].fd = socket(AF_INET6, SOCK_RAW, IPPROTO_DIVERT);
if (fds[2].fd == -1) err(1, "socket");
if (bind(fds[2].fd, (struct sockaddr *) &sin6, sizeof(sin6)) != 0)
err(1, "bind");

group = getgrnam("_spamd");
if (group == NULL) err(1, "getgrnam");
endgrent();
passwd = getpwnam("_spamd");
if (passwd == NULL) err(1, "getpwnam");
if (chroot("/var/empty") != 0) err(1, "chroot");
if (setgroups(0, NULL) != 0) err(1, "setgroups");
if (setgid(group->gr_gid) != 0) err(1, "setgid");
if (setuid(passwd->pw_uid) != 0) err(1, "setuid");

fds[0].events = POLLIN;
fds[1].events = POLLIN;
fds[2].events = POLLIN;

#if 0
states[0].af = AF_INET;
clock_gettime(CLOCK_MONOTONIC, &states[0].timeout);
states[0].timeout.tv_sec++;
states[0].addr.in4.s_addr = inet_addr("217.72.192.73");
fds[0].events |= POLLOUT;
#endif

while (1) {
char src[48], dst[48];
struct timespec timestamp;

#if DEBUG
for (i=0; i < 3; i++)
fprintf(stderr, "%d: fd:%d events:%hd revents:%hd\n",
i, fds[i].fd, fds[i].events, fds[i].revents);
fprintf(stderr, "Polling");
#endif
ret = -1;
for (i=0; i < NSTATES; i++)
if (states[i].af != 0 &&
(ret == -1 ||

Re: spamd and network whitelisting

2017-01-09 Thread Boudewijn Dijkstra
On Tue, 20 Dec 2016 12:31:05 +0100, Clint Pachl wrote:

[...]
grep "^GREY" |
tr "|" "\t" |
[...]


I've learned to do all parsing of /var/db/spamd via the  interface  
as the envelope-from sometimes contains a "|" (pipe) character.
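
For anyone who still wants to split the textual spamdb output, one way to tolerate a pipe in the envelope-from is to count fields from both ends of the line. This is only a sketch and not the interface mentioned above. It assumes the GREY layout documented in spamdb(8) (GREY|ip|helo|from|to|first|pass|expire|block|pass) and that only the from field can contain a stray "|". The name grey_fields is made up.

```sh
#!/bin/sh
# Print ip, helo, from and to of each GREY line as tab-separated fields.
# Assumption: five numeric fields always follow "to", so "to" can be
# located from the right even when "from" contains extra "|" characters.
grey_fields() {
    awk -F'|' '
    $1 == "GREY" && NF >= 10 {
        to_idx = NF - 5                 # position of the "to" field
        from = $4
        for (i = 5; i < to_idx; i++)    # re-join a split "from"
            from = from "|" $i
        printf "%s\t%s\t%s\t%s\n", $2, $3, from, $to_idx
    }'
}

# Possible use:  spamdb | grey_fields | sort -k3,4 -k2 -k1
```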






Re: spamd and network whitelisting

2016-12-21 Thread Boudewijn Dijkstra
On Tue, 20 Dec 2016 12:51:19 +0100, Clint Pachl wrote:

Devin Reade wrote on 12/19/16 12:59:

With respect to dealing with SPF, the simple solution (permitting an
IP if it is on the sending domain's SPF list) doesn't work too well
in the general case since it appears many spammers publish SPF records.


You're right. When I ran ruby-spf against the TRAPPED IPs in my  
spamdb, a surprising number passed SPF (like 15%). On the other hand,  
one of the popular email domains from our customer DB is @att.net, which  
doesn't even publish SPF. After some real life testing against our  
client email DB, I determined SPF was not effective in filtering spam  
for us. If it is used, it should be a small factor at best.


SPF was never meant for making accept/reject decisions on arbitrary  
domains.  If you don't trust the sending domain, then SPF evaluation is  
pointless.






Re: spamd and network whitelisting

2016-12-20 Thread Craig Skinner
Hello Clint,

On Fri, 16 Dec 2016 07:21:47 -0700 Clint Pachl wrote:
> I would like to share my 45-day experience with running spamd and my 
> observations and how I'm allowing mail from SMTP clusters to bypass 
> spamd. Feedback and discussion would be greatly appreciated.
> 

spamd in greylisting mode is indeed truly awesome!

With over 10 years real world experience running this way,
with several domains, I've tried a lot of ideas & scripts too...

The original design is very good and doesn't need much assistance.

To solve the clustered round-robin sender problem (Gmail, etc.), simply
bump the -G greyexp time from 4 hours to 4+ days; 100 hours is good.
Job done! No scripts needed.

When configured like this, most gmails come through in around 6 hours
to 1.5 days, with some a bit longer. The more inbound gmails, the
shorter the delay, down to a few minutes as volume increases.
Same for Outlook, Amazon (which are both worse than Gmail), etc.

Bumping the -G whiteexp time to 40 days helps a bit too.


Aggressive stuttering and a shrunk window foils almost all zombies.

Add in a fake highlisting -M to the mix, and it is game over for the
zombies, which love to target a backup MX box, so give them a trap.
(This needs a constantly deferring MTA on that IP address too.)

spamd_flags='-G 25:100:960 -S 90 -s 5 -w 1 -M  -y  -Y ... -Y ... -Y ...'
spamlogd_flags='-I -W 960 -Y ... -Y ... -Y ...'

(AOL only retries for 25 minutes (not the RFC 4 days), so if you
want to receive from AOL, the -G passtime needs to be ~10 minutes.)


Some pf rate limiting kills off those zombies that understand the 'try
again later' SMTP code, then start hammering the server all at once:

The 2nd rule blocks (after almost 2 days) badly set up M$ Extrange
servers, which retry every minute.


set block-policy drop

# Normal & highlisting Internet inbound operation via spamd:
pass in on $ext_if inet proto tcp \
from any port > 1023 \
to {$ext_if:0, $ext_if:2} port smtp \
divert-to localhost port spamd \
keep state \
(max-src-conn 30, max-src-conn-rate 50/9, \
overload  flush global)

pass in log on $ext_if inet proto tcp \
from  port > 1023 \
to {$ext_if:0, $ext_if:2} port smtp \
user root \
modulate state \
(max-src-conn 80, max-src-conn-rate 150/15000, \
overload  flush global)


block in log from 


EASY! SIMPLE! Nothing to break.

No special domain lookups or exception lists. No maintenance labour.





Bob's other tool I deployed for many years was his greyscanner (in
ports). Over the years, I modified this to do aggregate DNS black &
white listing too. When I realised that it was very rare for spam to
pass the extended stuttering, I stopped running greyscanner.




To revert to the default -G flags (4-hour grey expiry) and still help
promote round-robin senders from grey to white faster, I wrote this
simple script. It runs unprivileged once every 4 hours from cron.

No pf tables/lists, no doas/sudo rules. No SPF checks.

It operates on an fgrep pattern of spamd HELO hostnames, as Gmail,
Outlook, etc. relay for many domains, but HELO from Google/Outlook.

The decision to upgrade from grey to whitelisted status is based on
an accumulated sliding score of multiple DNS list lookups.

See http://web.Britvault.Co.UK/products/ungrey-robins/ & logs there.




Also try Boudewijn's patch (see his continued blocking graph):
https://github.com/bdijkstra82/OpenBSD-spamlogd


> 
> Thanks to all the developers who made spamd; an amazing, simple,
> clever tool.
> 

Aye!
-- 
Craig Skinner | http://linkd.in/yGqkv7



Re: spamd and network whitelisting

2016-12-20 Thread Clint Pachl

Devin Reade wrote on 12/19/16 12:59:

You might also want to look at bgp-spamd.


Yes, this was on my radar for quite some time. However, my simple spamd 
setup with assistance from the zen.spamhaus.org DNSBL has been extremely 
effective. It's nice to know we've got more big guns if needed.




With respect to dealing with SPF, the simple solution (permitting an
IP if it is on the sending domain's SPF list) doesn't work too well
in the general case since it appears many spammers publish SPF records.


You're right. When I ran ruby-spf against the TRAPPED IPs in my 
spamdb, a surprising number passed SPF (like 15%). On the other hand, 
one of the popular email domains from our customer DB is @att.net, which 
doesn't even publish SPF. After some real life testing against our 
client email DB, I determined SPF was not effective in filtering spam 
for us. If it is used, it should be a small factor at best.




Re: spamd and network whitelisting

2016-12-20 Thread Clint Pachl
Some have requested my scripts and configurations so here it is. Below 
you will find the spamd-dnsbl and spamclusterd scripts that are used for 
blacklisting spammers and whitelisting networks, respectively. Also 
included is dnsbl-check which I use for testing IPs against multiple DNSBLs.


In the crontab below, you will see that I archive the spamdb daily and 
save some stats mainly for post analysis. For instance, my initial spam 
fighting technique many years ago (prior to enabling spamd actually) was 
to block the IP networks (20,000+ IPv4 networks) of the countries in 
which we received the most spam, yet weren't expecting legitimate email 
from (i.e. China, Russia, India, Brazil, etc.). I still had this enabled 
up until 2016-12-17. So I make notes of changes like this to see the 
positive or negative effects and I have the spamdb archives to assist 
the analysis. Changing spamd_flags is something else I document.


A side note: Years ago, blocking spamming countries, for me here in the 
US, essentially got rid of my spam problem, but has become ineffective 
as many spammers are sending from US networks now, thus spamd. It has 
only been three days since I disabled spam country blocking, but I have 
received exactly 2 emails that have made it past spamd, which would have 
otherwise been blocked by the country IP block. Not bad, but we'll see 
what the stats look like in a couple of weeks. However, I can guarantee 
that the number of trapped entries in my spamdb will increase. I 
originally created my pf table of spamming countries from 
http://www.ipdeny.com/ipblocks/data/countries/


One of the other tests, which had significant impact, was using 
spamd.alloweddomains. I tried a few things, but settled on my current 
setup: for one email domain I list just the domain part (e.g. 
@domain1.com), but for the other domain, which has limited users, I list 
the full email addresses of all current accounts (e.g. 
us...@domain2.com, us...@domain2.com, ...). This increased my TRAPPED 
entries by 30%. These additional TRAPPED IPs were mainly one-shot 
spammers, so it was nice to tarpit them while I had the chance. So far 
spamd has been very effective so I haven't defined and published any 
SPAMTRAP addresses, but this is just another knob I can turn on and 
measure if needed.


To assist with spam management without root privileges, I added the spam 
administrator to the _spamd group, gave r/w group privileges on 
/var/db/spamd, and added a few pfctl commands to the doas.conf.


Overall I am ecstatic about spamd and its integration with pf, as well 
as the simple spamdb interface (with the help of grep(1), cut(1), 
sort(1), wc(1), column(1), sed(1), etc.). It is an extremely flexible 
and powerful toolset. Hopefully my experience and scripts are helpful to 
other spam fighters. I think you can look to other projects, like 
spamassassin for example, to get ideas of spam fighting techniques which 
can be implemented at a lower level using pf and spamd. For example, a 
set of factors could determine a spam "score" similar to spamassassin: 
if an IP is on multiple DNSBLs (each list weighted by quality), the DNS 
PTR doesn't correspond to the HELO, and it fails SPF, then it is 
probably safe to blacklist. The bgp-spamd.net project is another tool 
that could be added to the mix. You will have to balance complexity and 
effectiveness, but I would encourage simplicity and minimal resource usage.
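
The scoring idea in the paragraph above could be sketched in sh roughly as follows. This is an illustration only: the zone bl.example.net, every weight, and the function names are made up, and the DNS, PTR and SPF checks themselves are assumed to have been done elsewhere.

```sh
#!/bin/sh
# Weight of one DNSBL zone. All weights are hypothetical, as is the
# zone bl.example.net; unknown lists get the lowest weight.
dnsbl_weight() {
    case $1 in
    zen.spamhaus.org)   echo 5 ;;
    bl.example.net)     echo 2 ;;
    *)                  echo 1 ;;
    esac
}

# spam_score "<space-separated DNSBL hits>" <ptr_matches_helo> <spf_result>
#   ptr_matches_helo: yes or no
#   spf_result:       pass, fail or none
spam_score() {
    score=0
    for list in $1; do
        score=$((score + $(dnsbl_weight "$list")))
    done
    if [ "$2" = no ]; then
        score=$((score + 2))    # PTR does not correspond to the HELO
    fi
    if [ "$3" = fail ]; then
        score=$((score + 2))    # SPF check failed
    fi
    echo "$score"
}
```

A caller would compare the result with a threshold of its own choosing before running something like spamdb -t -a on the address.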


Again, hats off to all the developers.


=== spamclusterd ===

#!/bin/sh
#
# Whitelist an SMTP cluster network.
#
# NOTE: pipe spamdb(8) or an archive to stdin.

extract_helo_tld() { echo "$1" | sed -En 's/.*[[:<:]]([^.]+\.[^.]+)$/\1/p'; }
extract_ip_net() { echo "${1%.*}"; }

print_ip_net_with_mask() {
    echo "$(extract_ip_net $1).0/24"
}

helo_tld_match()
{
    tld1=$(extract_helo_tld "$1")
    tld2=$(extract_helo_tld "$2")
    [[ -n $tld1 && $tld1 = $tld2 ]]
}

ip_net_match()
{
    net1=$(extract_ip_net $1)
    net2=$(extract_ip_net $2)
    [[ $net1 = $net2 ]]
}

_ip=""
_helo=""
_from=""
_to=""
is_cluster=0

grep "^GREY" |
tr "|" "\t" |
cut -f2-5 |
sort -k3,4 -k2 -k1 |
{
    while read ip helo from to
    do
        # Same from/to, same HELO domain and same /24 as the previous tuple?
        if [[ $to = $_to && $from = $_from ]] &&
           helo_tld_match "$helo" "$_helo" &&
           ip_net_match "$ip" "$_ip"
        then
            is_cluster=1
        elif [[ $is_cluster = 1 ]]
        then
            is_cluster=0
            print_ip_net_with_mask $_ip
        fi

        _ip="$ip"
        _helo="$helo"
        _from="$from"
        _to="$to"
    done

    # Print a cluster that runs to the end of the input; without this
    # the last cluster in the sorted list would be lost.
    if [[ $is_cluster = 1 ]]
    then
        print_ip_net_with_mask $_ip
    fi
}




=== spamd-dnsbl ===

#!/bin/sh
#
# Query DNSBL using the IPs in spamdb(8). If an IP is on a black list, add it
# as a TRAPPED entry in the spamdb.
#
# It seems most spammers send once and go away. The 1 minute pass time is
# effective at stopping most of these spammers. The other spammers seem to
# resend 10 minutes to more than an hour later, so a longer pass time won't
# defend against such spammers. 

Re: spamd and network whitelisting

2016-12-19 Thread Devin Reade

You might also want to look at bgp-spamd.

With respect to dealing with SPF, the simple solution (permitting an
IP if it is on the sending domain's SPF list) doesn't work too well
in the general case since it appears many spammers publish SPF records.

However what I found works well, at least for some low-volume domains,
is to identify the subset of domains for which I would like to honour
the SPF records and automatically whitelist them.

I wrote a little perl script, available as:
  
The script takes a set of whitelisted domains and queries the DNS to
build up the matching set of whitelisted IPs.  It then puts these into
a file that can be loaded as a pf table.  This permits pf to bypass
spamd for these whitelisted domains.  There is extra usage information
(and a description of current limitations) in comments at the top of
the script.

This does require one to reload the pf configuration, however (due to
paranoia) the current version of the script doesn't do that. Instead,
it mails root if something has changed that would require the
configuration to be updated.  Experience shows that this doesn't trip
very often.

I invoke the script from daily.local as something like:

  /usr/local/sbin/gen-spf-whitelist \
  example.com \
  example.tld \
  something.else.net \
  (...)

I qualified the above by mentioning I was using it on some low-volume
domains because the current mechanism probably doesn't scale well
with respect to maintaining the list of domains.  It could probably
benefit from a couple of substantive changes:

- permit the whitelisted IPs to be updated without needing to have pf reload
 its rules.  This implies updating the pf table directly, in a manner
 similar to what is used for bgp-spamd.

- be able to tie in with a client management system that permits users
 to request domains to be whitelisted (only SPF-publishing domains could
 be whitelisted this way using this mechanism).

Potential candidate domains for inclusion will be obvious.  If you
'grep GREY /var/log/daemon', the most likely potential candidates are
those where you will see multiple delivery attempts from the same domain
to the same recipient but where the originating IPs differ (although
likely in the same net block).
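
A rough sh sketch of that candidate search is below. It does not parse /var/log/daemon itself; it assumes the ip, from and to values have already been extracted into tab-separated lines, which is a made-up intermediate format, and the function name is hypothetical too.

```sh
#!/bin/sh
# Input:  tab-separated lines "ip<TAB>from<TAB>to" (assumed format).
# Output: "sender-domain<TAB>recipient" pairs that were seen from more
#         than one source IP, sorted.
multi_ip_senders() {
    awk -F'\t' '
    {
        domain = $2
        sub(/^.*@/, "", domain)     # keep the part after the last "@"
        sub(/>$/, "", domain)       # drop a trailing ">" if present
        key = domain "\t" $3
        if (!((key, $1) in seen)) { # count each IP once per pair
            seen[key, $1] = 1
            count[key]++
        }
    }
    END {
        for (key in count)
            if (count[key] > 1)
                print key
    }' | sort
}
```

Domains printed by this are only candidates; whether to honour their SPF records is still a manual decision.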

Devin



spamd and network whitelisting

2016-12-16 Thread Clint Pachl
I would like to share my 45-day experience with running spamd and my 
observations and how I'm allowing mail from SMTP clusters to bypass 
spamd. Feedback and discussion would be greatly appreciated.


I have two domains that I have been using for my businesses: one is 13 
years old and the other is 8 years old. I have never had a spam problem 
until about six months ago. In October I was getting about 100-200 spams 
per day per domain. The spam rate was increasing from month to month. 
All mail was going directly to my OpenSMTPd. I was not using filtering 
of any kind, so the signal-to-noise ratio was very low and frustrating.


So I read the spamd and related man pages and enabled spamd on my 
firewall on November 1. I was astonished! I literally got 6 spam emails 
that first week for both domains!


However, the big problem was, I also wasn't getting legitimate business 
emails that were sent from SMTP clusters/pools. After studying my logs, 
tweaking spamd(8) flags, looking to external solutions (DNSBL, SPF, 
reverse IP verification), I had some observations and discovered some 
patterns. Here's the solution I'd like to share:


I wrote two very small scripts: spamd-dnsbl and spamclusterd. These 
scripts work together to keep spam to a minimum while passing all 
legitimate email (in my case so far).


1) spamd-dnsbl: Queries a DNSBL using the IPs in spamdb(8). If an IP is 
on a black list it is added as a TRAPPED entry in the spamdb. The script 
only checks IPs which have been added since last run. Currently, only 
the zen.spamhaus.org DNSBL is queried because I found it to be the most 
true of all those listed at 
http://en.wikipedia.org/wiki/Comparison_of_DNS_blacklists. 
Alternatively, multiple DNSBLs could be queried and the results could be 
used in aggregate to determine spam status, thus promoted to TRAPPED.


2) spamclusterd: Queries spamdb(8) for networks to whitelist, which it 
adds to a pf table that bypasses spamd. So before this script gets 
carried away allowing IP blocks to bypass spamd, the spamdb(8) is first 
pruned of spammers using the spamd-dnsbl script.
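
One possible way to limit the DNSBL check to IPs added since the last run is to keep a file of addresses already seen and compare against it. The sketch below is an assumption about how that could be done, not necessarily how spamd-dnsbl does it; the function name and the state file path in the usage comment are made up.

```sh
#!/bin/sh
# new_since_last_run SEEN_FILE
# Reads IPs on stdin (one per line), prints those not yet recorded in
# SEEN_FILE, then adds them to SEEN_FILE for the next run.
new_since_last_run() {
    seen=$1
    [ -f "$seen" ] || : > "$seen"
    tmp=$(mktemp) || return 1
    sort -u > "$tmp"                        # current IPs, deduplicated
    sort -u "$seen" | comm -13 - "$tmp"     # lines only in the current set
    sort -u "$seen" "$tmp" > "$tmp.new" && mv "$tmp.new" "$seen"
    rm -f "$tmp"
}

# Possible use (state file path is hypothetical):
#   spamdb | awk -F'|' '$1 == "GREY" { print $2 }' |
#       new_since_last_run /var/db/spamd-dnsbl.seen
```

Each address printed would then be looked up in the DNSBL and, if listed, added with spamdb -t -a.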


I've only been running this setup for about 30 days, but I haven't 
missed an email yet; plus spam is still about 1 per day across both 
domains. I receive emails from all the common SMTP clusters, such as 
Gmail, Microsoft (hotmail.com, outlook.com, msn.com, etc.), and Yahoo, 
but also US government agencies such as mail.mil, usmc.mil, uscg.mil, 
irs.gov, etc.


I noticed a pattern of commonalities of these legitimate sending clusters:

1. The envelope's from and to addresses are identical across tuples.

2. The HELOs are very similar, with the TLD from each tuple almost 
certainly the same.


3. They make multiple attempts from different IP addresses, however, the 
IPs differ only by a few bits. (Caveat: I'm only using IPv4)


These 3 points are the basis of spamclusterd. It works like this: if two 
or more GREY tuples have matching "to" and "from" addresses, HELOs with 
matching TLDs, and IPs with matching network bits (/24), the /24 network 
is added to the spamd-cluster table in pf, which bypasses spamd.


I was going to get fancy and do an SPF lookup and try to determine the 
exact network to whitelist, but simply whitelisting a 256 IP block seems 
good enough. Once in a while the subsequent client IP will be outside 
this block, but the /24 seems to work better than 90% of the time.


Currently, just two client IPs from the same /24 network is enough to 
get that network whitelisted, which seems like a low bar. However, with 
the prior DNSBL pruning, this seems sufficient for now.


## Some other observations ##

Spammers, even if sending from the same IP or IP network and regardless 
of the TO address, tend to randomize the FROM and/or HELO. Therefore, in 
the case of my spamclusterd script, whitelisting a spammer is less likely 
when ensuring both HELO and FROM match for multiple tuples. These IPs 
will then continue to deal with spamd, and it's business as usual.


I initially tried setting 1 minute passtime and 12 hour greyexp times 
for spamd (i.e. -G 1:12:864) in hopes to eventually whitelist a client 
IP, originating from a cluster, that has reattempted within that large 
window. However, in my first week, I missed a couple of Gmails which 
resent for 5+ days and ultimately failed to deliver. What was 
interesting was one of the Google server IPs retried after 12 hours and 
3 minutes, just missing the grey window, while others retried after 24 
hours. I now set -G 1:10:1080.


It seems safe to assume a spammer if reverse IP lookup returns NXDOMAIN 
and IP is on at least 1 reputable DNSBL or lookup returns SERVFAIL after 
two attempts.
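
That rule of thumb could be written down as a small sh predicate. The sketch reads the sentence as "(NXDOMAIN and listed) or SERVFAIL", which is one interpretation of it; the function name and argument conventions are made up, and the DNS lookups are assumed to happen elsewhere.

```sh
#!/bin/sh
# looks_like_spammer PTR_STATUS DNSBL_HITS
#   PTR_STATUS: NOERROR, NXDOMAIN or SERVFAIL (SERVFAIL meaning the
#               reverse lookup still failed on the second attempt)
#   DNSBL_HITS: number of reputable DNSBLs that list the IP
# Returns 0 (true) if the IP should be treated as a spammer.
looks_like_spammer() {
    case $1 in
    SERVFAIL)   return 0 ;;
    NXDOMAIN)   [ "$2" -ge 1 ] ;;   # no PTR and listed at least once
    *)          return 1 ;;
    esac
}
```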


Using SPF seems unreliable as of 11/22/16. Tested SPF on hundreds of IPs 
in spamdb using the ruby spf gem. More than half the IPs did not specify 
SPF or it failed in some way.

If the envelope's "from" is our domain (i.e., to and from addresses are 
the same domain), it is definitely a