TL;DR: As of May 17, @gentoo.org will drop incoming spammy mail instead of
delivering it. Speak now or hold your peace.
Hi all,
By long-standing practice, the official @gentoo.org system-level policy for
incoming mail has been to tag everything and delete nothing.
All deletion decisions were left to developers, via procmail/sieve/etc.
This was a good early policy, as Gentoo was a much more reliable host than
email providers were a decade ago. That is no longer true, given the meteoric
rise and success of Gmail.
A LOT of developers now forward their mail to systems that refuse mail from,
or temporarily blacklist, the forwarding system because it relays a lot of
spam. Gmail is particularly strict in this regard, throttling mail to any
recipient from the forwarding source.
This is particularly acute because more than 40% of the outgoing mail goes to
Google. That is well above the ~25% of destinations shown below, because the
very active devs forward their mail to Google.
This unfortunate combination means that ~40% of mail sits in a backlog for a
long time, and the active devs that use Gmail don't get their mail in a timely
fashion.
Unless there are any major objections, as of May 17th, Infra will start
dropping mail that scores more than 10.0 points in SpamAssassin.
If that is successful, I propose to lower the threshold by 1 point every month
until it reaches 5.0 (so by mid-October, Infra will be dropping mail that
scores more than 5.0).
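The proposed ramp-down can be sketched as a small schedule calculation. This is
only an illustration of the arithmetic, not how Infra will configure anything:
the function names, the MONTHS table and the assumption that each step lands in
the next calendar month are all made up for this example.

```python
# Hypothetical sketch of the proposed threshold schedule (names are illustrative).
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def threshold_schedule(start_month=5, start_score=10.0, floor=5.0, step=1.0):
    """Return (month name, drop threshold) pairs, lowering by `step` each month."""
    schedule = []
    month, score = start_month, start_score
    while score >= floor:
        schedule.append((MONTHS[(month - 1) % 12], score))
        month += 1
        score -= step
    return schedule

def should_drop(score, threshold):
    """Mail is dropped only when it scores strictly more than the threshold."""
    return score > threshold

for month, threshold in threshold_schedule():
    print(f"mid-{month}: drop mail scoring more than {threshold}")
```

Starting at 10.0 in May and stepping down by 1 point, the last entry is 5.0 in
October, which matches the mid-October estimate above.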
Stats on how mail is handled:
-----------------------------
~260 active devs
~180 .forward files
This breaks down to:
~70 procmail users
~10 sieve users
2 users with both forward and procmail
1 maildrop user
~100 devs that send mail outside of @gentoo.org (in their .forward)
I didn't analyze the procmail/sieve/maildrop accounts further.
I did break down the other forwarding destinations by domain:
~50 devs that forward directly to @gmail or @googlemail addresses
~10 devs that have their own domain hosted at gmail/googlemail
~40 devs with some other provider.
0 devs with yahoo, hotmail or msn domains as destinations :-).
As a result, about 25% of dev mail destinations are actually Google.
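The 25% figure follows from the rounded counts above. A quick check of the
arithmetic, assuming the approximate numbers are taken at face value (the
function and variable names are made up for this sketch):

```python
# Rounded counts taken from the stats above; all of them are approximate.
ACTIVE_DEVS = 260
GMAIL_DIRECT = 50      # forward directly to @gmail or @googlemail
GMAIL_HOSTED = 10      # own domain hosted at gmail/googlemail
OTHER_PROVIDER = 40    # some other provider

def google_shares():
    """Return Google's share of all active devs and of external forwards."""
    google = GMAIL_DIRECT + GMAIL_HOSTED
    external = google + OTHER_PROVIDER
    return {
        "of_active_devs": google / ACTIVE_DEVS,
        "of_external_forwards": google / external,
    }

shares = google_shares()
print(f"Google share of active devs: {shares['of_active_devs']:.0%}")
print(f"Google share of external forwards: {shares['of_external_forwards']:.0%}")
```

With these inputs, Google accounts for roughly 23% of active devs, which
rounds to the "about 25%" quoted above.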
Amavis stats:
-------------
Here are the amavis summary stats for @gentoo.org incoming mail that was
scanned for content (this happens before exploding to aliases and multiple
recipients, so the counts are a lot lower than you might otherwise expect).
"SPAMMY" in this case is >= 5.5.
26 May 3 Blocked INFECTED
1609 May 3 Passed CLEAN
1564 May 3 Passed SPAMMY
35 May 4 Blocked INFECTED
4129 May 4 Passed CLEAN
2304 May 4 Passed SPAMMY
2 May 4 Passed UNCHECKED
42 May 5 Blocked INFECTED
4458 May 5 Passed CLEAN
3183 May 5 Passed SPAMMY
4 May 5 Passed UNCHECKED
43 May 6 Blocked INFECTED
10 May 6 Blocked MTA-BLOCKED
5027 May 6 Passed CLEAN
3443 May 6 Passed SPAMMY
47 May 7 Blocked INFECTED
2 May 7 Blocked MTA-BLOCKED
4657 May 7 Passed CLEAN
3119 May 7 Passed SPAMMY
2 May 7 Passed UNCHECKED
35 May 8 Blocked INFECTED
5025 May 8 Passed CLEAN
2936 May 8 Passed SPAMMY
21 May 9 Blocked INFECTED
2497 May 9 Passed CLEAN
1765 May 9 Passed SPAMMY
16 May 10 Blocked INFECTED
2059 May 10 Passed CLEAN
2033 May 10 Passed SPAMMY
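To turn the summary lines above into a per-day spam share, something like the
following would work. It is a minimal sketch that assumes every line has the
five whitespace-separated fields shown (count, month, day, action, category);
the function name and the two-day SAMPLE excerpt are chosen for illustration.

```python
from collections import defaultdict

# Excerpt of the summary above (first and last day only).
SAMPLE = """\
26 May 3 Blocked INFECTED
1609 May 3 Passed CLEAN
1564 May 3 Passed SPAMMY
16 May 10 Blocked INFECTED
2059 May 10 Passed CLEAN
2033 May 10 Passed SPAMMY
"""

def spammy_share_by_day(text):
    """Map 'Mon D' to the fraction of that day's scanned mail tagged SPAMMY."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip anything that is not a summary line
        count, month, day, _action, category = parts
        counts[f"{month} {day}"][category] += int(count)
    return {
        day: cats.get("SPAMMY", 0) / sum(cats.values())
        for day, cats in counts.items()
    }

for day, share in spammy_share_by_day(SAMPLE).items():
    print(f"{day}: {share:.0%} of scanned mail was SPAMMY")
```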
Score analysis of 1 week of incoming mail to amavis:
----------------------------------------------------
~51k unique mails were scored, with a rough breakdown as follows:
~17k < 0.0
~13k 0.0 - 5.0
~7k 5.0 - 10.0
~5k 10.0 - 20.0
~5k 20.0 - 30.0
~3k > 30.0
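The breakdown above gives a rough idea of how much mail each threshold would
drop. The sketch below only handles thresholds that fall on a bucket boundary,
since the buckets say nothing about the scores inside them. The rounded buckets
sum to ~50k rather than ~51k, and mail scoring exactly on a boundary is assumed
to belong to the higher bucket; the names are made up for this example.

```python
# (lower bound, upper bound, count in thousands); None means unbounded.
BUCKETS_K = [
    (None, 0.0, 17),
    (0.0, 5.0, 13),
    (5.0, 10.0, 7),
    (10.0, 20.0, 5),
    (20.0, 30.0, 5),
    (30.0, None, 3),
]

def dropped_share(threshold):
    """Approximate fraction of scored mail at or above a bucket-boundary threshold."""
    lows = [low for low, _, _ in BUCKETS_K if low is not None]
    if threshold not in lows:
        raise ValueError("threshold must be a bucket boundary: %s" % lows)
    total = sum(k for _, _, k in BUCKETS_K)
    dropped = sum(k for low, _, k in BUCKETS_K
                  if low is not None and low >= threshold)
    return dropped / total

for t in (10.0, 5.0):
    print(f"threshold {t}: ~{dropped_share(t):.0%} of scored mail dropped")
```

On these rounded numbers, the initial 10.0 threshold would drop about a quarter
of scored mail, and the final 5.0 threshold about 40%.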
--
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : [email protected]
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85