I talked to Justin and he is not aware of a SurfNet machine.  They are a targeted sponsor who has helped us.  I've sent one email seeing if I can dig anything up but I must concur.

Dave's research shows that "trap-proc.spamassassin.org and spam-upload.spamassassin.org both resolve to 192.87.106.247 which is a SURFnet IP in the Netherlands.  I am not able to ping or nmap any response to that IP."

With that, we are giving up on the mass-check centralized box and shutting down the trap-proc and sought processing.


These are the tasks:

1 - Find out more from SurfNet about the 192.87.106.247 machine [KAM]

2 - [Joe, can you help?] Find out what Sonic is trapping and where they are sending it.

3 - Look at Sought 2.0 [Perhaps with AXB and Dave Jones]

4 - Delete the backup data on sa-vm1, move /usr/local/spamassassin to a different mount so that entire partition can be returned to Infra [DAVE]

5 - Open Infra ticket to return data. [KAM]


So Joe can you do the following:

- Please deprecate the old colo box.  Thank you for hosting and providing it on behalf of the project!

- the ASF would like to recognize Sonic as a sponsor.  Any update on the paperwork?  If paperwork is an issue, I can grandfather you without it.

- How does our bandwidth usage look to Sonic?  Should we bump it up or down?


Dave:

Any chance you can handle these tasks or let me know if they already were done?

- Put an A record for the box into DNS for apachesf.spamassassin.org

- Add to SysAdmins Docs / wiki the new apachesf and remove colo and note that trap-proc is no longer online.


On 1/16/2018 7:19 PM, Dave Jones wrote:
On 01/16/2018 09:00 AM, Kevin A. McGrail wrote:
Joe and Dave + SASA:

I have combed my notes.  Here's what I have.  I have removed some password info from it, but I think it can help rebuild the process.  Dave, can you take a look?  I can get you passwords. Talon1 still exists, spamassassin-vm does not, and I have a backup of spamassassin-vm from 3.7 months ago.

Regards,
KAM

#1 - Some boxes are just names for other boxes
trap-proc.spamassassin.org. Sonic has scripts set up to archive collected spam to that server.



trap-proc.spamassassin.org and spam-upload.spamassassin.org both resolve to 192.87.106.247 which is a SURFnet IP in the Netherlands.  I am not able to ping or nmap any response to that IP.
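The resolve-then-probe check described above can be sketched in Python. This is only an illustration: `probe_host`, the port list, and the injectable `resolve`/`connect` parameters are assumptions introduced here, not the ping and nmap commands that were actually run.

```python
import socket


def probe_host(name, ports=(22, 25, 873), timeout=3.0,
               resolve=socket.gethostbyname,
               connect=socket.create_connection):
    """Resolve a hostname, then attempt a TCP connect on each port.

    Returns (ip, {port: reachable}). The default ports (ssh, smtp,
    rsync) are an assumption, not taken from the thread.
    """
    ip = resolve(name)  # raises OSError (socket.gaierror) if lookup fails
    results = {}
    for port in ports:
        try:
            conn = connect((ip, port), timeout)
            conn.close()
            results[port] = True
        except OSError:
            # Covers refused connections, timeouts and unreachable hosts.
            results[port] = False
    return ip, results
```

Calling `probe_host("trap-proc.spamassassin.org")` and `probe_host("spam-upload.spamassassin.org")` would show whether both names still map to the same address and whether anything answers on it. A host that drops all packets looks the same as one that is powered off, so this cannot tell the two apart.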


#2 - My notes from spamassassin-vm.apache.org that catastrophically died:

this was the traps cron that needs to be added on spamassassin-vm

20 2 * * * rsync -rze ssh --whole-file --size-only --delete j...@trap-proc.spamassassin.org:/home/jm/cor/. /export/home/bbmass/uploadedcorpora/traps/.

DONE - add this traps account
DONE - fix perms for /export/home/bbmass/uploadedcorpora/traps/
DONE - add cron job
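For anyone rebuilding this, here is a small Python sketch that splits the traps crontab entry into its schedule and its rsync arguments. The helper name `split_cron` is hypothetical, and the elided user name in the source is kept exactly as it appears in the notes.

```python
import shlex

# The crontab entry from the notes; the user name is elided in the source.
CRON_LINE = ("20 2 * * * rsync -rze ssh --whole-file --size-only --delete "
             "j...@trap-proc.spamassassin.org:/home/jm/cor/. "
             "/export/home/bbmass/uploadedcorpora/traps/.")


def split_cron(line):
    """Split a crontab line into (schedule dict, command argv list)."""
    minute, hour, dom, month, dow, command = line.split(None, 5)
    schedule = {"minute": minute, "hour": hour, "day_of_month": dom,
                "month": month, "day_of_week": dow}
    return schedule, shlex.split(command)


schedule, argv = split_cron(CRON_LINE)
# schedule: minute 20, hour 2, every day -> runs daily at 02:20.
# rsync flags:
#   -r            recurse into directories
#   -z            compress data in transit
#   -e ssh        use ssh as the remote shell
#   --whole-file  copy whole files, skipping the delta-transfer algorithm
#   --size-only   skip files whose size already matches
#   --delete      remove destination files that no longer exist at the source
print(schedule)
print(argv)
```

Note that `--delete` means an empty source directory empties the local traps directory on the next run, which matters given that /home/jm/cor was later found empty.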



Since this IP is not responding to anything, I'm not sure how long this has been offline.  I can't see what purpose this would be serving based on the masscheck flow that worked last May on sa-vm1.apache.org.

The old spamassassin-vm.apache.org used to run a buildbot version for on-demand masschecking of uploaded corpus but we don't have that setup again anywhere.  If we do set this up again, I don't want to put this on the sa-vm1.apache.org without any swap space. I don't see the value of uploading ham/spam and running the masscheck centrally anymore.


#3 - From April 2017

Let me know if you are not the correct person to talk to about this, but
we are having issues reaching trap-proc.spamassassin.org. It looks like
we have some scripts set up to archive collected spam to that server,
and I haven't seen a successful connection for a few days now.

--
Grant Keller
System Operations
grant.kel...@sonic.com


#4 - from 2014


The box at Sonic is the backend for the SpamAssassin spamtraps feed.  To be honest, I am not sure anyone or anything is consuming the collected data at this stage -- it should probably be shut down, unless someone wants to take it over?


My vote is to shut it down.


incoming.spamassassin.org: this is the spamtrap machine at Sonic.  Basically, qpsmtpd handles the incoming SMTP traffic, handing it off via a Gearman queue to "gears" -- a set of scripts running in the background which filter out noise, crap, bounces, etc., then buffer them to mbox files and upload.

/home/trap contains the code, /home/trapper is the output files.  /etc/init.d/gears starts the scripts which compose it, copying them to /tmpfs first so they don't hit the disk where possible, for speed.

The main config file is at /home/trap/code/gears/config .

The buffered mbox files are then uploaded to my S3 account, using an IAM credential which can only access one single bucket called "mailtrap".  After 1 day those files are auto-expired.

This stuff all appears to be working ok, although the volume is pretty high (and I suspect
it's costing me a fair bit of money even despite the auto-expiration!)
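The one-day auto-expiry could be expressed as an S3 lifecycle rule. The thread does not say how the expiry was actually configured, so the sketch below is an assumption; the rule ID and the helper name `one_day_expiry_rule` are made up for illustration.

```python
import json


def one_day_expiry_rule(rule_id="expire-trap-mbox"):
    """Build an S3 lifecycle configuration that expires objects after 1 day.

    The empty prefix applies the rule to every object in the bucket.
    The rule ID is hypothetical.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                "Expiration": {"Days": 1},
            }
        ]
    }


print(json.dumps(one_day_expiry_rule(), indent=2))
```

With boto3, a dictionary of this shape can be passed as `LifecycleConfiguration` to the S3 client's `put_bucket_lifecycle_configuration` call for the "mailtrap" bucket. Expiry limits storage cost only; request and transfer charges at this volume are unaffected.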


The gears logs show no recent activity.  I think this has stopped being fed data/files a long time ago.


Next step is spamassassin2.zones.apache.org, which has an alias of "trap-proc.spamassassin.org" in DNS.  A cron on my user account runs "/home/trapscripts/copy_to_corpus" which (at least at some point) appears to have selected a randomised subset of uploaded spam corpora into /home/jm/cor/spam and /home/jm/cor/nonspam.  Those directories are now empty, so I think this part may have broken at some point in 2013 :(

I can't track down the script which downloads files from the S3 account, annoyingly!

Again, everything there runs as "jm".


Again, don't see the value of running a centralized masscheck anymore especially on sa-vm1.apache.org.


Finally there is talon1. The host is talon1.pccc.com; username "jm", password is in spamassassin2.zones.apache.org/root/sought_rules_info.txt (readable only by root).


I thought the SOUGHT rulesets became "unsupported" a long time ago.  If you want me to check out talon1.pccc.com, send me the creds.

That host is being used to generate the SOUGHT rulesets, and as far as I can see (apologies, I haven't been monitoring it at all recently!) it still seems to be doing so. It all runs from the "jm" user account, every 4 hours from cron; see "crontab -l".  Part of the process is to rsync-over-ssh the ham and spam corpus from jm@spamassassin2.


Then the final step of that script is to publish the files to my server at taint.org, by "svn commit"ing in ~/sought on talon1.  That directory commits back to an svn repo on that host over svn+ssh, then SSHes to that host and runs a script; that generates GPG signatures, updates the DNS records and pushes it to the Cloudfront/S3 bucket for rules.yerp.org. If/when you guys take this over, this bit definitely needs to be moved to another host and account, since that's my main personal server ;)

Having said that, I'm happy to hand over the credentials to all the other "jm" accounts
named above.  I've put the passwords into
spamassassin2.zones.apache.org/root/sought_rules_info.txt (readable only by root).
Feel free to take over those accounts and do what you will with them ;)

Sorry I haven't handed this over earlier -- even reverse-engineering all this took
quite a lot of effort.  Legacy systems suck!


I definitely agree!

The trap data comes from:   It's partly a typical spamtrapping MX capturing dead domains, and partly /etc/aliases forwards from other ISPs around the world, who are following the "how to donate your spamtrap to SA" instructions on the wiki.  Note that the latter means that we have to do a bunch of stripping off forwarding steps when/if we act on that data.
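As a rough illustration of "stripping off forwarding steps", the Python sketch below drops the newest Received headers from a forwarded trap message. It assumes each forwarding hop added exactly one Received header at the top of the message; the function name and the hop count are hypothetical, and this is not the logic the gears scripts actually used.

```python
from email import message_from_string


def strip_forwarding_hops(raw, hops):
    """Return the parsed message with the first `hops` Received headers removed.

    Received headers are prepended by each relay, so the first ones in the
    message are the most recent hops (the forwarder and the trap itself).
    """
    msg = message_from_string(raw)
    headers = msg.items()  # copy of (name, value) pairs in original order
    for name in {name for name, _ in headers}:
        del msg[name]  # removes every occurrence of that header
    skipped = 0
    for name, value in headers:
        if name.lower() == "received" and skipped < hops:
            skipped += 1
            continue
        msg[name] = value  # re-append, preserving the original order
    return msg
```

After stripping, the first remaining Received header should describe the hop where the spam reached the donating ISP, which is the one that matters for acting on the data.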

The domains are MX records hanging off existing, live domains; e.g. I'd add a "mx.taint.org", seed a few email addresses in those domains (e.g. for web scrapers), then MX the entire domain to the traps machine.

 > - the alias forwards: where are they pointing to?
Essentially there's a *@incoming.spamassassin.org catch-all, and the alias forwards redirect spam into named addresses there.


sought_rules_info.txt:

jm@talon1 password: <removed>

incoming.spamassassin.org = traps machine:  u root   p <removed>
                 u jm     p <removed>

jm account on zones2: <removed>

Doing about 74000 messages per day as of 2/5/2014

DONE - 1 - Get SSH access working - Pinged Justin on 4/22 - WORKS as of 4/23, going to 76.191.162.2

DONE - 2 - why does incoming.spamassassin.org have two IPs? - Emailed Justin

incoming.spamassassin.org. 3507 IN      A       76.191.162.2
incoming.spamassassin.org. 3507 IN      A       75.101.166.134

Huh. I had no idea we were still doing that ;)  That is the Mailchannels spamtrap IP.  If you remember back in 2008 (private@ was cc'd), they donated spamtrap hosting to us, in exchange for spam data.  We eventually moved off the donated spamtrap server (in EC2) which they were paying for, to the current one in PCCC.  It looks like we never changed the 50:50 split setup on the MX record though (and I'd forgotten about it).  I think we can probably turn that off now...

76.191.162.2 is our one.  I've just verified that I'm able to SSH to it as root.

DONE - 2a - Remove 75.101.166.134 from incoming.spamassassin.org. DNS entry


I think I added the second incoming A record (didn't see the first one) to the new sa-vm1.apache.org just because I found an Apache HTTPD virtual host setup on the old box for incoming but it's not doing anything.

3 - more?
On 1/15/2018 11:49 AM, Dave Jones wrote:

--
Dave

