[squid-users] Squid & Spamassassin

Chijioke Kalu Mon, 09 Jun 2003 22:08:05 -0700

Hi Duane, Henrik & Adrian,

I figured I would write to you first, because you are on the developers list and may be able to point me in the right direction and, if possible, assist in the project.

I have attached a letter I wrote to the SpamAssassin developers list; please read it first to understand the problem. The second letter, a reply from a SpamAssassin developer, is what prompted me to write to you; I have cut it down to the part that refers to Squid.

Basically, it is about how to modify the Squid proxy to pass HTTP POST requests to SpamAssassin, which will then check them for spam content.

Hope to hear from any of you soon.

Attached Text:
----------------
Chijioke Kalu said:

That subject is what I hope to achieve by joining the SpamAssassin developers forum and, if possible, to contribute to by programming this aspect of SpamAssassin.

I am a Nigerian developer/system administrator and, as many know, a high volume of spam mail originates from this country. It is difficult to eliminate by manual means, it regularly brings reprimands from the US government on Nigerian ISPs, and in general it makes a sysadmin's life a living hell.

The Problem:

These spam mails are sent via HTTP (webmail traffic), so they cannot be filtered at a mail server and go out even if you are not running one. Secondly, mass-mailing programs make use of proxying, thus avoiding blocked SMTP ports and still delivering their payload.

My Question:

Can SpamAssassin be tuned to filter such traffic before it even leaves the gateway of the source network?

Yes, this should be possible!

What needs to be done is to make a HTTP proxy server, which can detect a
HTTP POST form submission that looks like a mail message.  This should be
quite easy, since mail messages will have at least 1 CGI parameter that is
quite long, over 2Kb or so, e.g.

   POST http://somewebmailservice.com/cgi-bin/submit.cgi HTTP/1.0
   HTTP headers...
   ....

   [EMAIL PROTECTED]&[EMAIL PROTECTED]&body=The%20encoded%20text%20of%20
   the%20mail%20message%20lots%20of%20text%20here....

Note also that a mail "body" must contain:

 - lots of %20 space characters
 - several %0a line-feed characters for newlines
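The checks described above can be sketched in Python as follows. This is only an illustration, not part of Squid or SpamAssassin: the function name find_mail_like_param and the two count thresholds are assumptions, and only the 2 KB figure comes from the text. The length is measured after URL-decoding, which is also an assumption.

```python
from urllib.parse import parse_qsl

MIN_BODY_BYTES = 2048   # "over 2Kb or so", taken from the text above
MIN_SPACES = 20         # assumed threshold for "lots of" %20 characters
MIN_NEWLINES = 3        # assumed threshold for "several" %0a characters


def find_mail_like_param(post_body):
    """Return (name, text) of the first form field that looks like a
    mail body, or None if no field qualifies.

    post_body is the raw application/x-www-form-urlencoded request body.
    parse_qsl decodes %20 (and +) to spaces and %0a to line feeds.
    """
    for name, value in parse_qsl(post_body, keep_blank_values=True):
        if len(value.encode("utf-8")) < MIN_BODY_BYTES:
            continue  # too short to be a mail message
        if value.count(" ") >= MIN_SPACES and value.count("\n") >= MIN_NEWLINES:
            return name, value
    return None
```

A long parameter with no spaces or newlines (an uploaded blob, say) is skipped, so only text-like fields would be handed on for scoring.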

if such a CGI parameter is found, the proxy then has to create a "fake"
mail message using that as the text, and pass it to SpamAssassin
somehow.  Using the "spamc" client is a low-overhead way to do this.

Given the resulting score from "spamc", it can figure out if there is
a likelihood that the body text looks like spam.
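A rough Python sketch of the "fake message plus spamc" step might look like this. It assumes spamc is installed and a spamd daemon is reachable; it uses spamc's -c (check) mode, which prints "score/threshold" and exits with status 1 when the message is judged spam. The header values are placeholders, and all function names here are hypothetical.

```python
import subprocess
from email.message import EmailMessage


def build_fake_message(body_text):
    """Wrap the POSTed text in a minimal mail message (placeholder headers)."""
    msg = EmailMessage()
    msg["From"] = "webmail-user@localhost"      # placeholder, not a real sender
    msg["To"] = "recipient@localhost"           # placeholder
    msg["Subject"] = "HTTP POST form submission"
    msg.set_content(body_text)
    return msg.as_bytes()


def parse_spamc_check(output):
    """Parse the 'score/threshold' line printed by `spamc -c`."""
    score, threshold = output.strip().split("/")
    return float(score), float(threshold)


def score_with_spamc(message_bytes, run=subprocess.run):
    """Pipe the message to `spamc -c`; return (score, threshold, is_spam).

    `run` is injectable so the function can be exercised without spamd.
    """
    result = run(["spamc", "-c"], input=message_bytes,
                 stdout=subprocess.PIPE, check=False)
    score, threshold = parse_spamc_check(result.stdout.decode("ascii"))
    return score, threshold, result.returncode == 1
```

Since the headers are invented, header-based rules will not mean much here; the body rules and the Bayesian learner would carry most of the weight.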

If it does, then the proxy server should return a 4xx HTTP error code,
with an explanatory message in the text, indicating that it was
filtered as possible spam.

Training SpamAssassin's Bayesian learner with a lot of examples of 419
scam mails will also help a lot in gaining accuracy.

Then, once this is working, what you need is a "transparent HTTP proxy"
which can be used to ensure all HTTP traffic from the internet cafes passes
through this proxy.   In other words, a user opens a connection on
port 80 to a website, and the router transparently connects the TCP/IP
traffic to the proxy server, which proxies the HTTP traffic.

http://www.tldp.org/HOWTO/mini/TransparentProxy.html  gives some info
on a way to do this with Squid and Linux.

Given this, I would suggest a good way to implement this would be to:

 - write a patch in C for the Squid proxy server, which looks for HTTP
   POST form submissions using long CGI parameters that may be the
   body of a mail message
 - if one is found, it creates a "mail message" using that long text
   parameter, and runs "spamc"
 - if "spamc" says it may be spam, refuse the HTTP request
 - otherwise let it continue
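The four steps above could be wired together roughly as below. This is a Python sketch of the decision flow only, not the C patch for Squid itself. The names filter_request, find_candidate and is_spam are hypothetical; the detection and the spamc call are passed in as callables. The status 403 is one possible choice of 4xx code, picked here as an assumption.

```python
REFUSAL_TEXT = ("Request refused: the submitted text was filtered "
                "as possible spam.")


def filter_request(method, body, find_candidate, is_spam):
    """Decide what the proxy does with one request.

    find_candidate(body) -> the long text parameter, or None
    is_spam(text)        -> True if spamc says it may be spam

    Returns None to let the request continue unchanged, or a
    (status, message) pair to send back instead of forwarding it.
    """
    if method.upper() != "POST":
        return None                      # only form submissions are checked
    text = find_candidate(body)
    if text is None:
        return None                      # nothing that looks like a mail body
    if is_spam(text):
        return 403, REFUSAL_TEXT         # refuse the HTTP request
    return None                          # otherwise let it continue
```

Keeping the two checks as parameters means the cheap test (is there a long text field at all?) always runs first, and spamc is only invoked for the few requests that pass it.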


Regarding blocking proxy abuse; this is not so hard.  Here's how to do it.
Institute a policy of filtering outgoing IP traffic at your routers, and
block *outgoing* access to the following TCP ports:

1080, 8080, 3128, 8081, 8001, 8000, 10080

These are common ports used by proxies, which are almost never used for
other services.  (in the old days, port 8001 and 8000 were occasionally
used for websites, but I haven't seen one of these in about 4 years ;)

Since proxies are commonly used *inside* a private network to share a
connection, but are virtually never used *outside* a private net for
legitimate purposes, this should not have any serious side effects for
"normal" internet use.
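As an illustration of the port filtering described above, this small Python helper prints one possible Linux iptables rule for a router that forwards the private network's traffic. It assumes iptables with the multiport match is available and that eth0 is the outward-facing interface; both are assumptions, as is the choice to reject with a TCP reset rather than silently drop.

```python
# Ports listed in the text above as commonly used by proxies.
PROXY_PORTS = [1080, 8080, 3128, 8081, 8001, 8000, 10080]


def outbound_block_rule(ports, wan_iface="eth0"):
    """Build an iptables command that rejects forwarded TCP traffic
    leaving via wan_iface (assumed interface name) to the given ports."""
    port_list = ",".join(str(p) for p in ports)
    return (f"iptables -A FORWARD -o {wan_iface} -p tcp "
            f"-m multiport --dports {port_list} "
            f"-j REJECT --reject-with tcp-reset")


if __name__ == "__main__":
    print(outbound_block_rule(PROXY_PORTS))
```

Because the rule only matches traffic going out through the external interface, a proxy running inside the private network on one of these ports would not be affected.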

Good luck -- this is an interesting idea, and could have major effects
on the 419'ers.   Any other help you may need, feel free to contact
this group and we'll be happy to help!

--j.

----------------


Nigerian 419 Spam/Scam Mails
============================

Problem:
-------
Nigerian 419 spam/scam mails have shamefully and unfortunately earned the mockery and antipathy of everyone in the world who has an email address, not to mention system administrators, who constantly have to keep their SpamAssassin rule logic up to date. As has been published, it costs the US government millions of dollars to try to curtail it, not to mention the untold ISP/cybercafe businesses it pulls under because of the sheer difficulty of handling it. Handling it most times requires manual monitoring, which costs both human and technology resources, not to mention the embarrassment faced by both cafe operators and web users who find their email letters being looked at just to confirm whether they are scams or not. Talk of privacy?

Method Used by 419 Spams:
------------------------
The process of 419 spam is well documented. It always requires the second party to be a greedy individual or a stupid one; that is the only reason the business flourishes, or seems to flourish. Usually there are two methods used to carry out the spamming. The first method is very laborious: email addresses are copied from search engine results for particular criteria, and individual letters are then typed and sent to each addressee. The second method is more effective and the most dangerous: mass-mailing programs and email spiders are used to send to up to 1000 email addresses at once, after the addresses have been extracted by email spider bots that go through the web looking for their targets (email addresses).

Tools Used:
----------
Rolling Launcher, Beijing Email Extractor, Advanced Email Extractor, Email Spyder Easy, Email Blaster, ICQ Search Engines, Email Launcher. (So far these are the ones I have come across.)

Goal:
----
To use the most advanced, efficient, open-source spam filter at present, SpamAssassin (TM), applying reverse logic to filter the mails before they leave via HTTP web traffic (this time around; this may involve some form of packet filtering). The advantage is that this will greatly reduce the overall effectiveness of spam mails and make life easier for system administrators at both ends of the problem.


Implementation/Platform:
-----------------------
        SpamAssassin on GNU Linux Operating Systems

Solution:
--------
No solution yet. The method being used right now affects legitimate clients: it involves blocking the URLs of the webmail services used for sending the spam/scam mails and applying a patch to Windows machines that does not allow the spam tools to run. The more drastic approach, carried out by satellite companies and bigger ISPs, is to block the IP addresses on recommendation from SpamCop. Talk of Pink Letters!

Other stuff I just want to comment on
-------------------------------------

SpamCop:
-------
The greatest problem for Nigerian ISPs and cybercafes. Their reporting creates a non-tolerant attitude among satellite providers, whose approach is to keep blocking IP addresses, forcing the companies to continuously purchase IP blocks from them (a good business, I may add).

Nigerian Government:
-------------------
Is it their fault? This is a moot point. For one, the so-labelled third-world country is unable to find the necessary resources to help provide a better life for its citizens; thus, if it is to choose the lesser of two evils, the citizens (419 operators) might as well attempt to see how smart they are, or how stupid!

419, Scam or Greed:
------------------
Is it really a scam? As the saying goes, it takes two to tango. There must be the other one, usually and always the greedy one, who is ready to exploit the resources of another country in an attempt to aggrandize oneself.

US Government & its Citizens:
----------------------------
The US Government, aargh, what can I say; simply put, uuuhhh, there is not much that can be said. I'll simply repeat what a pal I met said: "It's our responsibility to protect the stupid citizens in our country". I think that says it all.
