Warren Togami wrote:

http://ruleqa.spamassassin.org/20091006-r822170-n/T_CN_URL/detail
A very sizeable amount of spam (currently 50%) contains .cn domains that were registered very recently. They keep registering new domains in order to keep ahead of the URIBLs.

I have an account here that gets a lot of spam. There have been 263 unique .cn domain names contained within URLs in spam message bodies of that account today. All but 94 of them were listed in URIBL or SURBL.

If I do HTTP requests on http://thedomain/ for each of those domains, every single page returned matches one of the following two regexes:

<link [^>]*href="/themes/express/img/pharmacyexpress\.ico" [^>]*>
<title>Prestige Replicas : Luxury at affordable prices!</title>
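
For illustration only, here is a minimal Python sketch of that check. It is not taken from the module mentioned below; the names SPAM_PAGE_PATTERNS and looks_like_spam_page are made up for this example, and it assumes the page body has already been fetched into a string.

```python
import re

# The two patterns quoted above, compiled verbatim.
SPAM_PAGE_PATTERNS = [
    re.compile(r'<link [^>]*href="/themes/express/img/pharmacyexpress\.ico" [^>]*>'),
    re.compile(r'<title>Prestige Replicas : Luxury at affordable prices!</title>'),
]


def looks_like_spam_page(html):
    """Return True if the page body matches either known spam-site pattern."""
    return any(pattern.search(html) for pattern in SPAM_PAGE_PATTERNS)
```

Note that the first pattern requires a space after the closing quote of the href attribute, so a link tag that ends immediately after href would not match.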

I wrote a module a while ago, when the groups.yahoo.com spam was happening, which pulled down those pages and found that every single one of them contained HTML like this:

<font color="red" size="6"><b>CLICK HERE TO ENTER!</b></font></a>

I've updated it to do HTTP requests on the .cn domains now too. It uses memcache to avoid repeated requests for the same websites.
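
The caching idea can be sketched roughly as follows. This is an assumption-laden illustration, not the module's code: a plain dict stands in for memcache, the one-hour TTL is arbitrary, and CachedScanner, fetch and classify are hypothetical names. The fetcher and the page classifier are passed in as callables so the sketch makes no network requests on its own.

```python
import time


class CachedScanner:
    """Cache one verdict per domain so each website is fetched at most once per TTL."""

    def __init__(self, fetch, classify, ttl=3600.0, clock=time.time):
        self._fetch = fetch        # callable: url -> page body (str)
        self._classify = classify  # callable: page body -> bool
        self._ttl = ttl            # seconds to keep a verdict (arbitrary default)
        self._clock = clock
        self._cache = {}           # domain -> (expires_at, verdict); stand-in for memcache

    def is_spam_domain(self, domain):
        key = domain.lower().rstrip(".")  # normalise so Example.CN and example.cn share an entry
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]                 # cached verdict, no HTTP request
        verdict = self._classify(self._fetch("http://%s/" % key))
        self._cache[key] = (now + self._ttl, verdict)
        return verdict
```

With a real memcache client the dict lookups would become get and set calls with an expiry time, but the flow stays the same: look up the domain, fetch and classify only on a miss, then store the verdict.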

This is usually the point where someone asks for the source code, even though it's not fully ready for other people to use, so I've temporarily stuck it up at https://secure.grepular.com/WebsiteScanner/ in case anyone wants to pick it apart and use bits of it.

--
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
