On Tue, 20 Jul 2004 15:27:52 +0200, Marc Kool <[EMAIL PROTECTED]> wrote:
> Hi Jeff,
>
> Jeff Chan wrote:
> > Doing a little preliminary checking of this particular dataset
> > leads me to wonder a little how appropriate it might be for
> > SURBLs.  In particular I found over a hundred whitelist hits of
> > sites like aol.com, att.net, btopenworld.com, budweiser.com,
> > clara.net, cnet.com, comcast.net, he.net, lsu.edu, match.com,
> > mindspring.com, msn.com, rr.com, sina.com, texas.net, tripod.com,
> > umich.edu, victoriassecret.com, washington.edu, etc.:
> >
> > http://spamcheck.freeapp.net/adult.domains.whitelist-hits
>
> I did a quick check on a few domains and I do not share your conclusion.
>
> # grep aol.com domains
> adultaol.com
> register.oscar.aol.com
> sex-aol.com
> sexonaol.com
> usaol.com

register.oscar.aol.com is the server used by AOL Messenger and ICQ to
log in - how on earth does this count as an adult website, much less a
sex site?!

> # grep att.net domains
> adultonly.home.att.net
> borderjumper.home.att.net
> brookeb.home.att.net
> chrisd054.home.att.net
> dating.home.att.net
> divinenews.home.att.net
> lilcindy.home.att.net
> livevids.home.att.net
> livevids2.home.att.net
> livevids3.home.att.net
> livevids4.home.att.net
> models.home.att.net
> models2.home.att.net
> personals.home.att.net
> pvelasquez.home.att.net
> sasha69.home.att.net
> sex-ads.home.att.net
> sexworld.home.att.net
> xxxmovies.home.att.net

Ahh, the plot thickens... Subdomains.

> # grep -w au.com domains
> aotoys.au.com
> condoms.au.com
> freeporn.au.com
> hornytoad.au.com
> muff.au.com

Still more.

> So aol.com and att.net and au.com are not in the database and not blacklisted.
> no subdomain of aol.com is in the blacklist.

What is register.oscar.aol.com if it isn't a subdomain?

> For au.com and att.net there are only adult subdomains in the blacklist.
> This is ok.

However, SURBLs in general don't use subdomains. I've just run a test on
my personal SURBL, and SpamCopURI doesn't currently look at subdomains.
I suspect this is because checking subdomains would require a lookup per
domain level, which would obviously both make things inefficient and
leave room for a denial of service.

> I assume that something went wrong when you verified the quality of the
> database.

I think what went wrong was the level of understanding of what was in
the DB and what SURBL was able to do.

Given my very quick testing, I think it would probably be worth giving
this data a try. We would most likely need to work out how to remove the
subdomained entries - the list is huge, and any efficiency we can gain
by removing excess data would obviously be useful.

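As a rough illustration of the two points above (the cost of one lookup per domain level, and separating out the subdomained entries), here is a minimal Python sketch. It is not how SpamCopURI or any SURBL tooling actually works; the function names and the TWO_LEVEL_SUFFIXES set are hypothetical, and the set in particular is an assumption that au.com-style suffixes take registrations one level down. A real cleanup would need a much fuller suffix list.

```python
# Hypothetical, hand-picked suffixes under which registrations happen one
# level down. This is an assumption for illustration only, not a list that
# SURBL or SpamCopURI ships.
TWO_LEVEL_SUFFIXES = {"au.com", "co.uk", "com.au"}


def registered_domain(host):
    """Reduce a host name to the domain a SURBL-style list would hold."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) >= 3 and ".".join(labels[-2:]) in TWO_LEVEL_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def candidate_levels(host):
    """List every name that would need a lookup if subdomains were checked."""
    host = host.lower().rstrip(".")
    labels = host.split(".")
    base_len = len(registered_domain(host).split("."))
    return [".".join(labels[-n:]) for n in range(base_len, len(labels) + 1)]


def split_entries(hosts):
    """Separate plain registered domains from subdomained entries."""
    keep, subdomained = [], []
    for host in hosts:
        host = host.strip().lower().rstrip(".")
        if not host:
            continue
        if host == registered_domain(host):
            keep.append(host)
        else:
            subdomained.append(host)
    return keep, subdomained
```

With this sketch, register.oscar.aol.com would need three lookups (aol.com, oscar.aol.com, register.oscar.aol.com) instead of one, and entries such as models.home.att.net land in the subdomained pile for review rather than being silently collapsed to att.net.
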
The data is somewhat preemptive - just because you have an adult content
website doesn't always mean you are spamming; in fact, I'm sure there
are an awful lot of adult sites which never spam. I do, however, feel
that there is a need for this kind of data: there are a lot of
organisations which have liability concerns if their users receive
pornographic messages (schools), and many people who find adult content
offensive (churches etc.).

I reckon let's give it a go for a while like we did with 6dos - what's
the worst that can happen? We might get another SURBL - well, more
content is always a good thing in that case :)

-- 
Regards,
David Hooton