Thanks for taking the time to post this; I really appreciate it.

I spent some time nosing around the dmoz site (I'll admit this is
the first I've heard of it). Do you know where to find the
guidelines for what goes into each category?

I'm currently combining blacklists from 2 sources on a weekly basis,
and your adult domains list contained almost 16,000 entries that I
did not have. I'm glad to see this new source; we really need the
help to maintain a fairly decent list.
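
For anyone who wants to do the same kind of comparison, here is a minimal sketch of how merging could work. It is not the process I actually run; the function names are made up for illustration, and it assumes each list is plain text with one domain per line.

```python
def load_domains(lines):
    """Turn an iterable of text lines into a set of lowercase domains.

    Assumes one domain per line; blank lines are skipped.
    """
    domains = set()
    for line in lines:
        entry = line.strip().lower()
        if entry:
            domains.add(entry)
    return domains


def merge_blacklists(existing, *sources):
    """Merge extra sources into an existing set of domains.

    Returns (merged, new_entries), where new_entries holds the domains
    that were not already present in `existing`.
    """
    merged = set(existing)
    for source in sources:
        merged |= source
    return merged, merged - set(existing)


if __name__ == "__main__":
    # Tiny in-memory stand-ins for real domain files (hypothetical data).
    current = load_domains(["example.com", "Example.ORG", ""])
    incoming = load_domains(["example.org", "example.net"])
    merged, new_entries = merge_blacklists(current, incoming)
    print(len(merged), "domains after merge;", len(new_entries), "new")
```

Using sets keeps duplicates out automatically, and the second return value is the count of entries a new source contributes.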

The squidGuard web site says that the robot runs 3 times a week;
that would be 39 runs since the beginning of the year. In reality
the robot has run 5 times since December 20th, and looking at its
results is not a real confidence booster:

Run        Net Change    Net Change
Date       to Domains    to URLs
1/15/02    +57,175       -4,332
1/31/02    + 3,430       +1,305
2/15/02    +36,748       +  464
3/13/02    +   548       -  710
3/20/02    -   295       -  646
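
The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: the per-run numbers are copied from the table above, and the 13-week span is an assumption inferred from the 39-run figure.

```python
# (run date, net change to domains, net change to URLs), from the table above
runs = [
    ("1/15/02", 57175, -4332),
    ("1/31/02", 3430, 1305),
    ("2/15/02", 36748, 464),
    ("3/13/02", 548, -710),
    ("3/20/02", -295, -646),
]

total_domains = sum(domains for _, domains, _ in runs)
total_urls = sum(urls for _, _, urls in runs)

# Assumption: about 13 weeks since the start of the year, at 3 runs a week.
expected_runs = 13 * 3

print("runs recorded:", len(runs), "of an expected", expected_runs)
print("net change to domains:", total_domains)
print("net change to URLs:", total_urls)
```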

Thanks again,
Rick Matthews

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Masanori Harada
Sent: Friday, April 05, 2002 12:53 AM
To: Squidguard Mailing List
Subject: dmozlists


Hello all,

I wrote a script that extracts URLs from the RDF
dump of the Open Directory Project (available at
http://dmoz.org/rdf.html)
and converts them into urls/domains rules for squidGuard.
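
As a rough illustration of the conversion step only (this is not the actual script, and the function names are hypothetical), the sketch below splits a list of already extracted URLs into domain rules and URL rules. It assumes that a URL with no path becomes a domains entry and that anything with a path becomes a urls entry written without the scheme.

```python
from urllib.parse import urlsplit


def to_rule(url):
    """Classify one URL as a ("domains", host) or ("urls", host/path) rule.

    Returns None when no host name can be parsed from the input.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host:
        return None
    path = parts.path.rstrip("/")
    if path:
        # Assumption: URL rules are stored as host + path, scheme dropped.
        return ("urls", host + path)
    return ("domains", host)


def split_rules(urls):
    """Split URLs into sorted, de-duplicated domain and URL rule lists."""
    domains, url_rules = set(), set()
    for url in urls:
        rule = to_rule(url)
        if rule is None:
            continue
        kind, value = rule
        if kind == "domains":
            domains.add(value)
        else:
            url_rules.add(value)
    return sorted(domains), sorted(url_rules)


if __name__ == "__main__":
    sample = ["http://Example.com/", "http://example.org/a/b.html"]
    domains, url_rules = split_rules(sample)
    print("\n".join(domains))
    print("\n".join(url_rules))
```

Parsing the RDF dump itself is left out here, since its exact layout is not described in this message.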

In particular, the rules extracted from dmoz.org/Adult/ and
dmoz.org/Kids_and_Teens should be useful, as they are
quite large and, I hope, checked by human editors.

    dmozlists/adult/domains         31995 lines
    dmozlists/adult/urls            61394 lines
    dmozlists/kids_and_teens/domains 5783 lines
    dmozlists/kids_and_teens/urls   10003 lines

You can get the script and its output at:
  http://www.ingrid.org/~harada/filtering/

Enjoy!
--
Masanori Harada
NTT Network Innovation Laboratories
