Thanks for taking the time to post this; I really appreciate it. I spent some time nosing around the dmoz site (I'll admit this is the first I've heard of it). Do you know where to find the guidelines for what goes into each category?
I'm currently combining blacklists from 2 sources on a weekly basis, and your adult domains contained almost 16,000 entries that I did not have. I'm glad to see this new source; we really need the help to maintain a fairly decent list.

The squidGuard web site says that the robot runs 3 times a week; that would be 39 runs since the beginning of the year. In reality the robot has run 5 times since December 20th, and looking at its results is not a real confidence booster:

  Run        Net Change    Net Change
  Date       to Domains    to URLs
  1/15/02     +57,175       -4,332
  1/31/02     + 3,430       +1,305
  2/15/02     +36,748       +  464
  3/13/02     +   548       -  710
  3/20/02     -   295       -  646

Thanks again,
Rick Matthews

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Masanori Harada
Sent: Friday, April 05, 2002 12:53 AM
To: Squidguard Mailing List
Subject: dmozlists

Hello all,

I wrote a script which extracts URLs from the RDF dump of Open Directory Project (available at http://dmoz.org/rdf.html) and converts them into urls/domains rules for squidGuard.

Especially, rules extracted from dmoz.org/Adult/ and dmoz.org/Kids_and_Teens would be useful as they are quite big and checked by human editors, I hope.

  dmozlists/adult/domains             31995 lines
  dmozlists/adult/urls                61394 lines
  dmozlists/kids_and_teens/domains     5783 lines
  dmozlists/kids_and_teens/urls       10003 lines

You can get the script and its output at:
http://www.ingrid.org/~harada/filtering/

Enjoy!

--
Masanori Harada
NTT Network Innovation Laboratories
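
For anyone who wants to try the same kind of comparison, here is a minimal Python sketch of the two steps discussed in this thread: turning a URL into a domains or urls entry, and finding the entries a new list adds to an existing one. It is not Harada's script. The function names are hypothetical, and it assumes that a domains list holds one bare host per line and a urls list holds host/path with no scheme.

```python
from urllib.parse import urlsplit


def to_squidguard_entry(url):
    """Classify one URL as a ("domains", host) or ("urls", host/path) entry.

    Assumption: a URL with no path beyond "/" covers the whole site and
    belongs in the domains list; anything deeper belongs in the urls list.
    Returns None when no host can be parsed (e.g. the scheme is missing).
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if not host:
        return None
    path = parts.path.rstrip("/")
    if path:
        return ("urls", host + path)
    return ("domains", host)


def new_entries(candidate, existing):
    """Return the sorted entries in `candidate` that `existing` lacks."""
    return sorted(set(candidate) - set(existing))


if __name__ == "__main__":
    # Illustrative data only, not taken from any real blacklist.
    print(to_squidguard_entry("http://Example.com/"))
    print(to_squidguard_entry("http://example.org/a/b/"))
    print(new_entries(["a.example", "b.example"], ["b.example"]))
```

Counting the result of new_entries for a freshly downloaded domains file against the combined weekly list is one way to arrive at a figure like the "almost 16,000" mentioned above.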
