First, great work, Jeff.

>
>While 102k domains isn't nearly as large as the 2.3M in dmoz,
>it's certainly more than the 12k or so whitelist records we
>currently have.  How does the intersected list look as a
>potential whitelist?
>
>  http://spamcheck.freeapp.net/whitelists/wikipedia-dmoz.srt
>

I think it is just plain nuts to whitelist all of these! Why? If we don't
try to whitelist the most popular sites, then what the heck is the point? We
could go on whitelisting millions of legit domains forever. The popular ones
are the most important.

Here is one from the above list. Why would listing this help us?
http://oigawa-railway.co.jp/
(looks like a really popular site, huh!)

>Please also take a look at these blocklist hits (potential FPs)
>and share what you think:
>
>  http://spamcheck.freeapp.net/whitelists/wikipedia-dmoz-blocklist.summed.txt

I picked a few of these that might give us problems, and none of them met our
current criteria for listing. (sissy-world.com, good grief, that had to be a
man at one time!) With the ability to now see whitelisted domains in the
crossref page, I don't see a problem with whitelisting all of the domains on
this list, because if they do start spamming again, we can see that they are
whitelisted and remove them.


So:
-1 for adding all of the intersected domains to the WL

+1 for whitelisting the blacklist hits.

--Chris
