> I'm curious if your son used general search engine to find, or is there
> some bulletin board or specialized search engine somewhere that helped him
> to find so many hits?

I wondered about that, too. He used <http://www.teoma.com/>, and I've found
that my expressionlist is less effective on searches made from that site. It
appears that Google, for example, puts the search terms at the end of the URL
string, whereas Teoma embeds them in the middle of it. I'm not sure that's
the cause, but I was catching some terms on Google and missing the very same
terms on Teoma.

> You think the commercial solutions are any better?

I'd really like to think so. (But I don't know for sure.)

> Last time I checked (last year), the major commercial products
> databases were not really accessible by SquidGuard and there wasn't any
> interest in this effort.

I can understand that. It's one thing to try to protect your assets when
you are selling programs; how are you going to protect a text file of domain
names? We'd need to meet them halfway on this. Does Berkeley DB support
encryption? Perhaps the purchase of a subscription could allow you to
download the latest encrypted db, and an expiring certificate would then
enable Berkeley DB to decrypt and load it. Maybe they would be a bit more
accommodating of us in that environment?

> 1) coax more participation in database sharing and work on more
> effective means to automate this process to allow greater numbers to
> cooperate in the joint maintenance of tables.

I will be glad to share my scripts and blacklists, but I'll need more
convincing before I'll agree to take someone else's changes carte blanche.
(Specifically - deletions.) You've probably noticed that we don't all think
alike. :-)

> 2) investigate additional ways of finding and listing no-no sites
> besides just the current robots.

I'd agree with that.

> 3) investigate working deals with commercial products

As mentioned earlier, I'd agree with that, too.

Good stuff - thanks!

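To illustrate the search-term placement issue mentioned above, here is a
minimal Python sketch. It is not my actual expressionlist, and the URLs and
the "q" parameter name are hypothetical stand-ins, since I have not verified
the exact layout either site uses. It assumes the miss comes from a pattern
that only matches when the term closes the URL, and shows two
position-independent alternatives.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical example URLs; the real parameter layouts may differ.
TERM_AT_END = "http://search.example/search?hl=en&q=blockedterm"
TERM_EMBEDDED = "http://search.example/search?q=blockedterm&qcat=1&qsrc=0"

# Assumed failure mode: the pattern only matches when the term ends the URL.
END_ANCHORED = re.compile(r"[?&]q=blockedterm$", re.IGNORECASE)

# Position-independent pattern: the term may be followed by more parameters.
POSITION_FREE = re.compile(r"[?&]q=blockedterm(&|$)", re.IGNORECASE)


def search_terms(url, param="q"):
    """Return the lower-cased search words from `param`, wherever it sits."""
    query = urlsplit(url).query
    values = parse_qs(query).get(param, [])
    return [word.lower() for value in values for word in value.split()]


if __name__ == "__main__":
    for url in (TERM_AT_END, TERM_EMBEDDED):
        print(url)
        print("  end-anchored match: ", bool(END_ANCHORED.search(url)))
        print("  position-free match:", bool(POSITION_FREE.search(url)))
        print("  parsed terms:       ", search_terms(url))
```

If the guess is right, the end-anchored pattern catches only the first URL,
while the other two approaches catch both. Whether the parsing approach can
be used depends on where the check runs; inside an expressionlist only the
regular-expression form applies.
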
Rick

> -----Original Message-----
> From: Jerry Winegarden [mailto:[EMAIL PROTECTED]]
> Sent: Friday, June 28, 2002 12:03 PM
> To: Rick Matthews
> Cc: Squidguard Mailing List
> Subject: Re: Think you're doing a good job of blocking porn?
>
>
> On Thu, 27 Jun 2002, Rick Matthews wrote:
>
> > I *thought* I was...
> >
> > My blacklists are automatically updated each week. Every week I download
> > the blacklists from the squidGuard site:
> <snip>
> > the adult list from the Université of Toulouse:
> <snip>
> > and the adult section from the dmozlists:
> <snip>
> > and combine them with my local updates (over 12,000 of them). After
> > removing duplicates, my porn database is 223,225 domains and
> > 105,561 urls. I also run an expressionlist in my porn destination group
> > and programmatically shut down internet access after midnight.
> <snip>
> > spent about 90 minutes surfing. I pulled a few squid reports this
> > morning and he spent the entire time looking at porn. Many, *many*
> > megabytes of porn. He was being blocked on about 20 to 25% of his
> > attempts.
>
> 20-25% does seem a bit low, (even for social scientists accustomed to
> regression R's of .4 or .5! Bad inside statistics joke)
>
> I'm curious if your son used general search engine to find, or is there
> some bulletin board or specialized search engine somewhere that helped him
> to find so many hits?
>
> <snip>
> >
> >
> > I've got way too much effort in this for a 20% success rate.
> >
> > I'm looking for a commercial solution.
>
> You think the commercial solutions are any better? Well they might be
> more up around 50% or 70% or maybe even 90% at any one moment, but that
> still doesn't solve the problem. And they are getting paid to be
> good at it. Nor does it solve your problem if your son is adept at
> targeted searches.
>
> In my case, (or my clients' cases), current commercial solutions do not
> appear to be feasible because of economics and technology.
>
> Economics:
> In your case,
> if you have 2-3 machines, you're looking at perhaps $30US / 6 months for
> a commercial service such as Cyberpatrol or NetNanny. In my case, I
> support several community centers which have almost $0 support budget.
> Since you're only as good as your last update in the filtering wars, this
> is a major obstacle. I haven't checked this year, but as of last year,
> there weren't any major price breaks available to community centers to
> make it even close to feasible to use commercial services.
> Technical:
>    1) Lack of flexibility - lack of ACL's in commercial products,
>       lack of granularity in restricting access by a wide
>       variety of individuals (children and adults are often
>       served in most community centers I support).
>    2) Lack of interoperability of commercial databases with SquidGuard,
>       which does provide a sufficiently-granular framework.
> Last year I proposed to a couple of major commercial vendors to
> work with them to develop a version of their databases or a conversion
> tool for their databases in order to purchase database updates from them
> but to run under SquidGuard, so that ACL's could be used.
> Last time I checked (last year), the major commercial products
> databases were not really accessible by SquidGuard and there wasn't any
> interest in this effort. I even offered to develop the conversion tool
> (plenty of students and other programmers hanging around RTP-area of NC)
> if database structure was revealed (under non-disclosures). No takers.
> Perhaps afraid of open-sources? Perhaps just too busy keeping up with
> their current business to worry about developing an additional revenue
> stream? (e.g. get ISP's filtering business. I know AOL was involved
> with at least one of the filtering companies)
>
> So, although commercial may be worth a shot for you immediately,
> it certainly is not for me.
> Also, although I haven't done any
> analysis of the filtering service lately in my community centers
> (some have their own servers, some use my squidGuard server), I also
> haven't had any complaints for awhile, either. I've also merged lists
> from the various sources and added any
> sites that slip through if my clients find them and that's seemed to keep
> them happy.
>
> The problem with the free squidGuard lists is the sheer enormity of the
> amount of work required to keep them up2date (all puns intended.)
> The commercials get paid to spend a lot of effort to search and search
> and refine. It's hard to automate the whole effort and so it's hard for
> open sources community to compete like that. However, we do have one
> thing that the commercials don't have - potentially huge numbers of people
> and machines to throw at the effort. With those 3 lists plus yours, how
> many people and machines are involved in getting your lists prepared?
> Perhaps a handful, and volunteers at that?
>
> It seems to me that we need to follow 3 strategies:
>
> 1) coax more participation in database sharing and work on more
> effective means to automate this process to allow greater numbers to
> cooperate in the joint maintenance of tables. Right now, we run the
> dumb robots and we add a few sites here or there that we are told about
> and we merge about 3 versions of these lists. There must be some tools
> and procedures we can invent to improve this process of sharing what
> we do have.
>
> 2) investigate additional ways of finding and listing no-no sites
> besides just the current robots. The reason I asked about how your
> son found those sites is that there may be some other resources or
> techniques that he used to find them - why not exploit those resources?
> Perhaps it's a bit like trying to enlist the services of (former) crackers
> in order to tighten your system's security?
>
> 3) investigate working deals with commercial products to produce a
> squidGuard version or conversion tools so that there are more options
> available for blacklist maintenance?
>
> >
> >
> > Rick
> >
> >
> >
>
> --
> ***************************************************************************
> Jerry Winegarden    OIT/Technical Support    Duke University
> [EMAIL PROTECTED]    http://www-jerry.oit.duke.edu
> ***************************************************************************
>
>
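
The merge-and-deduplicate step described in the quoted message, together
with the reservation about accepting other people's deletions, could be
sketched roughly as follows. This is an illustrative assumption rather than
anyone's actual script: the function names, the one-entry-per-line input
format, and the "#" comment convention are all hypothetical. Additions from
every source are merged automatically, while proposed deletions are only
reported for manual review.

```python
def normalize(entry):
    """Lower-case and trim one list entry; return None for blanks/comments."""
    entry = entry.strip().lower()
    if not entry or entry.startswith("#"):
        return None
    return entry


def merge_lists(sources, local_additions=(), proposed_deletions=()):
    """Merge several blacklists into one sorted, duplicate-free list.

    sources            -- iterable of iterables of entries (one per source)
    local_additions    -- entries maintained locally, always included
    proposed_deletions -- entries someone else suggests removing; these are
                          NOT removed, only returned for review
    """
    merged = set()
    for lines in list(sources) + [local_additions]:
        for line in lines:
            entry = normalize(line)
            if entry is not None:
                merged.add(entry)

    # Only deletions that actually hit the merged list need a decision.
    candidates = {normalize(line) for line in proposed_deletions}
    pending_review = sorted(e for e in candidates if e in merged)
    return sorted(merged), pending_review


if __name__ == "__main__":
    list_a = ["Bad.example\n", "worse.example\n"]
    list_b = ["worse.example\n", "# comment\n", "\n"]
    merged, pending = merge_lists([list_a, list_b],
                                  local_additions=["local.example"],
                                  proposed_deletions=["bad.example"])
    print(len(merged), "entries after removing duplicates")
    print("deletions awaiting review:", pending)
```

Keeping deletions in a separate review list lets several maintainers share
additions freely without anyone having to trust another site's removals.
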
