My apologies for the poor formatting on this - it's Outlook Express. I saw this and thought I'd add my two penn'orth. Just a bit of background: I control 4 squid caches, used by quite a few schools and about a score of local organisations, with about half a dozen commercial firewalls/web appliances plumbed in at various locations to provide alternative filtering and transparent caching.
Usual disclaimers apply - I speak for me, not for my organisation.

>> I'm curious if your son used a general search engine to find them, or is there
>> some bulletin board or specialized search engine somewhere that helped him
>> to find so many hits?

I'd just like to explain here that porn sites aren't some kind of homogeneous mist, and if squidGuard knows about 90% of them, that doesn't mean you will be blocked 90% of the time. Think of it as a tree with lots of branches. If squidGuard doesn't know about the branch you are on, then you have virtually free rein. If you are on a branch SG knows about, then you can't get anywhere. Companies which run lots of websites tend to crosslink heavily within themselves, but rarely link to other companies.

> I wondered about that, too. He used <http://www.teoma.com/> and I've
> found that my expressionlist is not as effective on searches made from
> that site. It appears that a google search, for example, sends the
> search terms at the end of the string, where teoma sends the search
> terms embedded in the string. I'm not sure that's the difference, but
> I was catching some terms on google and missing the very same terms
> on teoma.

I have this problem as well. At the moment I have a good enough exception list that the kids can only get their "fix" from google. I find that the URL

  http://images.google.com/images?q=blowjob&ie=UTF-8&oe=UTF8&hl=en

is not blocked, but

  http://images.google.com/images?q=blowjob

is blocked. Looks like a bug to me.

>> You think the commercial solutions are any better?
>
> I'd really like to think so. (But I don't know for sure.)

They aren't. Last year I was in the situation of having a commercial firewall provide its blocking, then using transparent redirection to the SG boxes to try to plug the gaps. My experience of commercial boxes (more than one type): their false positives are worse (blocking commercial rivals really REALLY annoys me), and their false negatives are worse.
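For what it's worth, my guess at what is going on with those two google URLs, sketched with Python's re module. The two patterns here are hypothetical stand-ins for an expressionlist entry, not squidGuard's actual rules:

```python
import re

# Hypothetical expressionlist entries - illustrations only, not taken
# from squidGuard or any real blacklist.
anchored   = re.compile(r"[?&]q=blowjob$")       # only matches at the end of the URL
unanchored = re.compile(r"[?&]q=blowjob(&|$)")   # matches anywhere in the query string

url_short = "http://images.google.com/images?q=blowjob"
url_long  = "http://images.google.com/images?q=blowjob&ie=UTF-8&oe=UTF8&hl=en"

print(bool(anchored.search(url_short)))    # True  - blocked
print(bool(anchored.search(url_long)))     # False - slips through, as observed
print(bool(unanchored.search(url_short)))  # True
print(bool(unanchored.search(url_long)))   # True  - also caught
```

If that is indeed the cause, terminating the term with `(&|$)` rather than `$` in the expressionlist would close the gap.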
One of the ones I used blocked only on IP address, which keeled over when given web clusters and virtual domains. The management facilities were... poor, in a steaming-pile sort of way. There was no way of replicating filter lists across appliances, so you had to type your exceptions in on *every* box. Of course they were very closed, very black boxes, so you couldn't fix any of it. The approx. 700 GBP annual subscription per box wasn't really an issue for me, since it isn't my money - but it is my time. Since squidGuard, most of the subscriptions have been allowed to lapse.

>> Last time I checked (last year), the major commercial products'
>> databases were not really accessible by SquidGuard and there wasn't any
>> interest in this effort.
>
> I can understand that. It's one thing to try and protect your assets
> when you are selling programs; how are you going to protect a text file
> of domain names? We'd need to meet them halfway on this. Does the
> Berkeley db support encryption? Perhaps the purchase of a subscription
> could allow you to download the latest encrypted db, then an expiring
> certificate would enable Berkeley to decrypt and load the db. Maybe
> they would be a bit more accommodating of us in that environment?

You don't need encryption - just one-way hashing with a sufficiently large output value (a 128-bit digest or wider).

>> 1) coax more participation in database sharing and work on more
>> effective means to automate this process to allow greater numbers to
>> cooperate in the joint maintenance of tables.
>
> I will be glad to share my scripts and blacklists, but I'll need more
> convincing before I'll agree to take someone else's changes carte
> blanche. (Specifically - deletions.) You've probably noticed that we
> don't all think alike. :-)

I too will be glad to share scripts and blacklists. I think the problems with changes could be accommodated by having sufficiently detailed change descriptions.
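To illustrate the hashing idea: a minimal Python sketch of shipping a domain list as one-way digests. The function names and the choice of SHA-256 (a convenient wide digest) are my own illustration, not any vendor's actual format:

```python
import hashlib

def digest(domain):
    # One-way hash of a normalised domain name.
    return hashlib.sha256(domain.strip().lower().encode()).hexdigest()

def make_digest_list(domains):
    # What a vendor could publish: digests only, no readable names.
    return {digest(d) for d in domains}

def is_blocked(domain, digests):
    # The filter hashes each requested domain and looks it up.
    return digest(domain) in digests

# Example domains are placeholders, not real sites.
published = make_digest_list(["example-bad.test", "another-bad.test"])
print(is_blocked("example-bad.test", published))  # True
print(is_blocked("innocent.test", published))     # False
```

One caveat: domain names are guessable, so a determined party could still recover much of a hashed list by brute force - but it does stop casual copying of the plain text, which is presumably what the vendors care about.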
Those of us who deal with schools have the problem that a model wearing a *tight* bikini and scratching behind her ears with her toes isn't classically "pornography" - but we need to block it nevertheless. Perhaps a "glamour" category?

Also, we would need to come to a common arrangement on how the lists are merged. At the moment I have something like

  pass my_exceptions !my_localporn !dmoz_lists !squidguard_lists !tolouse_lists all

and I also use pre-generated database files. All my scripts reflect that arrangement.

>> 2) investigate additional ways of finding and listing no-no sites
>> besides just the current robots.
>
> I'd agree with that.

The best way is to go through your logs looking for dodgy words. I have a webmin script for that...

>> 3) investigate working deals with commercial products
>
> As mentioned earlier, I'd agree with that, too.
>
> Good stuff - thanks!
>
> Rick

Hope this helps

Regards

Anthony
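P.S. For anyone who wants to try the log trawl: a minimal sketch in Python, assuming squid's native access.log format (URL in the 7th field). The path and the word list are placeholders - my actual script is in webmin, not Python:

```python
import re

# Placeholder seed words - use your own list.
DODGY = ["blowjob", "xxx"]
pattern = re.compile("|".join(map(re.escape, DODGY)), re.IGNORECASE)

def trawl(log_path="/var/log/squid/access.log"):
    # Collect each distinct URL whose text contains a dodgy word.
    hits = set()
    with open(log_path) as log:
        for line in log:
            fields = line.split()
            # squid native format: the requested URL is field 7 (index 6).
            if len(fields) > 6 and pattern.search(fields[6]):
                hits.add(fields[6])
    return sorted(hits)
```

The resulting URLs then want a manual eyeball before anything goes into a blacklist - dodgy words turn up in perfectly innocent URLs too.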
