Hi,

This is something that I have built and have running for the organization that I work for. I decided in the end to operate on the main .db files and use the +/- mechanism, in addition to creating new databases. For example,
porn.db contains filtered1.sexsite.com; a user requests to unfilter sexsite.com (and assumes *.sexsite.com will be unfiltered). Unless the porn.db files themselves are updated (i.e. not just adding an entry to another !notfiltered database), squidGuard can incorrectly pass or fail sites.

I have an engine in REXX that:

* Maintains .db files through the use of .diff files and text caches
* Provides users with an easy-to-use Winblows control applet
* Authenticates users so that only privileged users can use the control system
* Allows users to unfilter/filter URLs/domains, or query the status of a site
* Allows users to purge a URL from the cache
* Allows administrators (applet access uses granular permissions) to restart the server or Squid, or to get server statistics
* Logs all applet requests and serializes requests to the db manager
* Permits or denies internet access (e.g. cache access) on a per-machine or per-room/building basis
* Automatically downloads .db file updates from the squidGuard site and builds them into the running database, preserving user changes (this part is still WIP)
* Has a server command-line version of the applet that allows scheduling of access/URL filters and batch jobs

If anyone is interested, please let me know. The system is called "College Web Control" and runs on eCS/Warp. The cache is 30Gb in size and holds many millions of objects. The server is a PIII 1GHz running Squid 2.3STABLE4 and is very fast and reliable, much better than MS Proxy...

I was considering marketing the system, but if not, it will be made available open-source. I realise most of you run some sort of Linux/Unix, but there are REXX engines for Linux, and/or porting to Perl would not be too big a deal.

Regards, Steve.

> James, is this something you are creating for a client? Or is it a
> public subscription service that you will market? Or maybe you will be
> selling the application, or even open-source?
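The +/- mechanism mentioned above can be sketched at the text level: a .diff file holds lines prefixed with `+` (add) or `-` (remove), which are applied to the plain domains list before the .db file is rebuilt. This is only a minimal illustration; the paths and domain names are made up, and the final squidGuard rebuild step is left commented out since it depends on your installation.

```shell
#!/bin/sh
# Sketch: apply +/- entries from a .diff file to a squidGuard domains
# list, then rebuild the .db. Paths and domains are illustrative only.
set -e
BL=/tmp/blacklist-demo/porn          # hypothetical blacklist directory
mkdir -p "$BL"

# Existing domains list (would normally come from the blacklist download)
printf 'filtered1.sexsite.com\nads.example.net\n' > "$BL/domains"

# +/- diff: '+' lines are added, '-' lines are removed
printf -- '-filtered1.sexsite.com\n+new.sexsite.com\n' > "$BL/domains.diff"

while read -r line; do
  entry=${line#?}                    # strip the leading + or -
  case $line in
    +*) grep -qxF "$entry" "$BL/domains" || echo "$entry" >> "$BL/domains" ;;
    -*) grep -vxF "$entry" "$BL/domains" > "$BL/domains.tmp" || true
        mv "$BL/domains.tmp" "$BL/domains" ;;
  esac
done < "$BL/domains.diff"

cat "$BL/domains"
# squidGuard -C porn/domains         # rebuild porn.db from the updated list
```

With the sample data above, the resulting domains file contains ads.example.net and new.sexsite.com.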
It started as a project for a client, but since we use squidGuard ourselves and have several other clients using it who have asked for something along these lines, we've decided to build it and work on the subscription service angle for now.

> It sounds like it will be a subscription service. Will you be running
> some sort of robot on a regular basis to update the blacklists? I would
> see that as the primary service that one would be willing to pay for.
> The web-based administration would be icing on the cake, but I wouldn't
> want to pay for the icing without the cake.

For now I was planning on providing the ability to download just those changes you want to add to your copy of the blacklist database. Since I'm already downloading the blacklist database on a weekly basis, I could provide a link to our copy, but it would probably be better if you had a script in place that pulled down a copy to your machine and then merged in the changes that we are keeping track of for you.

> I like the direction that you are heading, but I'm not sure I understand
> the purpose of the .local files that you've described. Am I correct that
> in this scenario my .local files would be utilized *instead* of updating
> my personal database at your site? What does the .local file process
> give you that you wouldn't have with the existing .diff files?

The .diff files are only good if you have a .db file currently built. In the scenarios we are running squidGuard in, we don't always have the .db files built (for reasons of size, etc.). The .local file is a file that squidGuard is not using, and so it should always be left alone, unless you blow away the entire blacklist directory. :)

The contents of the .local file are added to the domains or urls file (which should be replaced with the original version each time you do an update), and then you rebuild the domains or urls file to get the version you just created. Look at the squidGuard.conf file that was attached for an example of this in action.
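The .local workflow described above can be sketched as a simple replace-and-append: restore the pristine domains file from the download, fold in the local entries, then rebuild. This is a minimal sketch; the directory layout and file names are assumptions, and the squidGuard rebuild step is commented out since it depends on your installation.

```shell
#!/bin/sh
# Sketch: merge a hand-maintained domains.local into a freshly updated
# domains file, then rebuild the .db. Layout is illustrative only.
set -e
BL=/tmp/blacklist-demo/allowed
mkdir -p "$BL"

# Pristine domains file, as shipped by the weekly blacklist download
printf 'example-allowed.com\n' > "$BL/domains.orig"

# Local additions that must survive every update
printf 'cnn.com\n' > "$BL/domains.local"

# Replace domains with the original, then fold in the local entries
cp "$BL/domains.orig" "$BL/domains"
cat "$BL/domains.local" >> "$BL/domains"

cat "$BL/domains"
# squidGuard -C allowed/domains   # rebuild allowed.db from the merged list
```

Because domains is regenerated from domains.orig each time, repeated updates never duplicate the .local entries.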
> In fact, the .local files seem to ignore the fact that it is equally
> important that we are able to *remove* entries from the robot-generated
> file. For example:

I could take that into account, but that would probably require .local-add and .local-remove files, or using the +/- format to signal additions vs. removals (at the moment I'm just using grep and cat to determine if I should add a line to the domains file or not). I've been going on the assumption that if a site is wrongly blocked in porn (e.g. cnn.com), you just add it to the allowed/domains file and you're good to go. If enough people think it should just be removed from the porn database entirely, then I've got some more design to do on the database structures and on how the program lets you enter blacklist-related entries.

> I'd like to have a web function that would allow me to review all users'
> +/- entries, by group (porn, ads, warez, etc). All of the entries would
> be alphabetized as a single group with no indication of the contributing
> user. Entries that I am already using would not show on this list. There
> would be a check box (or radio button) next to each entry, and I could
> click the ones that I wanted and when I submitted the page they would be
> added to my personal +/- list.

This wasn't something we planned for, but it might be implementable in a future version.

> I guess that's enough for now. ;-)
>
> Are you sorry you asked?

No, since this is the only way to actually design/implement the best possible solution.

--
James A. Pattie
[EMAIL PROTECTED]
Linux -- SysAdmin / Programmer
PC & Web Xperience, Inc.  http://www.pcxperience.com/
