Thanks to Fabrice Prigent, the attached scripts have been updated and
corrected. (My perl statements were missing a "^" and were removing ALL
of the "-", instead of the leading "-".) The error was introduced after
component testing and I missed it. My apologies to the group, and my
thanks to Fabrice!
------------------------------------------------------

Attached files:
update_blacklists
update_blacklists_nodownload

Setup instructions:

- Change the BLACKDIR= entry to match the dbhome path declaration in
your squidGuard.conf file
- Create /blacklists/adult directory. (For the Université Toulouse
blacklist.)
- Create /blacklists/adult/archive directory
- Create /blacklists/porn/archive directory
- Create /blacklists/porn/stats directory
- If you don't already have it, download and install wget
(http://wget.sunsite.dk/)
- Issue the following command: chown -R squid.squid
[insert_your_dbhome]/blacklists
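For what it's worth, the setup steps above boil down to something like
this (BLACKDIR is a placeholder here; substitute the dbhome path from
your squidGuard.conf, and run the chown as root):

```shell
#!/bin/sh
# Sketch of the setup steps; BLACKDIR is a placeholder path.
BLACKDIR=${BLACKDIR:-/tmp/blacklists}

# Create the adult and porn working directories:
mkdir -p "$BLACKDIR/adult/archive" \
         "$BLACKDIR/porn/archive" \
         "$BLACKDIR/porn/stats"

# Hand the tree to the squid user (needs root; skipped quietly
# here if that is not possible on this system):
chown -R squid:squid "$BLACKDIR" 2>/dev/null || true
```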

NOTE: The files in the adult directory will be combined with the porn
files. It is not necessary to add an adult destination in your
squidGuard.conf file.

Operation:

The update_blacklists script downloads the blacklists file from the
squidGuard site, and downloads a second adult blacklist from the
Université Toulouse in France. The adult files (domains & urls) are
merged with the porn files and de-duped.
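The merge/de-dup itself is nothing exotic; a minimal sketch (the paths
and demo entries are illustrative, since the real lists come out of the
downloaded tarballs) is:

```shell
#!/bin/sh
# Sketch of the merge/de-dup step: adult domains are folded into
# the porn list. Paths and demo entries are illustrative.
BLACKDIR=${BLACKDIR:-/tmp/merge_demo}
mkdir -p "$BLACKDIR/adult" "$BLACKDIR/porn"

# Demo input (normally extracted from the downloaded tarballs):
printf 'badsite.example\nnastysite.example\n' > "$BLACKDIR/porn/domains"
printf 'nastysite.example\nothersite.example\n' > "$BLACKDIR/adult/domains"

# Combine both lists, sort, and drop duplicates:
sort -u "$BLACKDIR/porn/domains" "$BLACKDIR/adult/domains" \
    > "$BLACKDIR/porn/domains.merged"
mv "$BLACKDIR/porn/domains.merged" "$BLACKDIR/porn/domains"
```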

The downloaded blacklists files are stored in your dbhome directory with
the date and time downloaded embedded in the file name. The squidGuard
filename ends with _sg.tar.gz, the file from France ends with
_fr.tar.gz. These downloaded files will accumulate; you will need to
handle the housekeeping issues as you see fit.
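The naming convention looks roughly like this (the touch commands stand
in for the actual wget downloads, and the URLs are omitted on purpose):

```shell
#!/bin/sh
# Naming sketch for the downloaded tarballs: date and time are
# embedded in the file name, with _sg / _fr suffixes as described.
BLACKDIR=${BLACKDIR:-/tmp/blacklists}
mkdir -p "$BLACKDIR"
STAMP=$(date +%Y%m%d_%H%M%S)

# In the real script these would be wget downloads, e.g.
#   wget -O "$BLACKDIR/${STAMP}_sg.tar.gz" <squidGuard blacklist URL>
# touch stands in for the download in this sketch:
touch "$BLACKDIR/${STAMP}_sg.tar.gz"
touch "$BLACKDIR/${STAMP}_fr.tar.gz"
```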

The standard operation has not been altered. "domains.diff" and
"urls.diff" files will be processed as per the documentation if they
exist in each of your destination directories.

I have added some extra processing for the porn destination.

The script will look for the files "domains_diff.local" and
"urls_diff.local" in the porn directory, and if they exist they will be
used in the creation of the final domains and urls files. The
_diff.local files contain +/- entries in the same format as the diff
files, but they do not require the housekeeping that is suggested for
the diff files. You can add entries to your _diff.local files without
worrying about duplications, and then you can (pretty much) forget about
them.
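In case it helps to see the idea, here is a rough sketch of applying a
domains_diff.local file with sed/grep rather than the script's actual
perl (note the anchored "^", which is exactly the fix mentioned at the
top of this message; the paths and entries are made up for the demo):

```shell
#!/bin/sh
# Sketch of applying a domains_diff.local file: "+" lines are
# added, "-" lines removed, as in the squidGuard diff format.
WORK=${WORK:-/tmp/diff_demo}
mkdir -p "$WORK"

# Demo input:
printf 'alpha.example\nbeta.example\n' > "$WORK/domains"
printf '+gamma.example\n-beta.example\n' > "$WORK/domains_diff.local"

# Additions: strip only the LEADING "+" (anchored with ^):
sed -n 's/^+//p' "$WORK/domains_diff.local" >> "$WORK/domains"

# Removals: strip only the LEADING "-" and delete those entries:
sed -n 's/^-//p' "$WORK/domains_diff.local" > "$WORK/remove.tmp"
grep -v -x -F -f "$WORK/remove.tmp" "$WORK/domains" | sort -u \
    > "$WORK/domains.new"
mv "$WORK/domains.new" "$WORK/domains"
```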

The last four copies of the /adult/domains and /adult/urls files can be
found in the /adult/archive directory; the "0" file is the most recent
and the "-3" file is the oldest. The same holds for the /porn/domains
and /porn/urls files, which are archived in the /porn/archive
directory.
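The rotation can be pictured like this (the file names and tmp paths
here are illustrative, not the script's exact ones):

```shell
#!/bin/sh
# Sketch of the 4-deep archive rotation: ".0" is the newest copy
# and ".-3" the oldest, which gets dropped each run.
ARCH=${ARCH:-/tmp/archive_demo}
mkdir -p "$ARCH"
echo "current list" > /tmp/archive_demo_domains

# Drop the oldest copy and shift the rest down one slot:
rm -f "$ARCH/domains.-3"
if [ -f "$ARCH/domains.-2" ]; then mv "$ARCH/domains.-2" "$ARCH/domains.-3"; fi
if [ -f "$ARCH/domains.-1" ]; then mv "$ARCH/domains.-1" "$ARCH/domains.-2"; fi
if [ -f "$ARCH/domains.0" ]; then mv "$ARCH/domains.0" "$ARCH/domains.-1"; fi

# The freshly built file becomes the new ".0":
cp /tmp/archive_demo_domains "$ARCH/domains.0"
```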

A stats file is created and placed in the /porn/stats directory each
time the script executes. I've found this to be extremely helpful in
monitoring the condition (and implied validity) of the downloaded
blacklist files. These stats files will accumulate; you will need to
handle the housekeeping issues as you see fit.
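A stats snapshot of the kind described might be as simple as line
counts; the format and file names below are my guesses, not the
script's exact output:

```shell
#!/bin/sh
# Sketch of a stats snapshot: line counts of the final lists,
# written to a timestamped file under porn/stats.
BLACKDIR=${BLACKDIR:-/tmp/stats_demo}
mkdir -p "$BLACKDIR/porn/stats"

# Demo input:
printf 'a.example\nb.example\n' > "$BLACKDIR/porn/domains"
printf 'http://c.example/x\n' > "$BLACKDIR/porn/urls"

STATS="$BLACKDIR/porn/stats/stats.$(date +%Y%m%d_%H%M%S)"
{
  echo "domains: $(wc -l < "$BLACKDIR/porn/domains")"
  echo "urls:    $(wc -l < "$BLACKDIR/porn/urls")"
} > "$STATS"
cat "$STATS"
```

A sudden drop in the counts from one run to the next is a good hint
that a download came through truncated or empty.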

The "update_blacklists_nodownload" script runs through the same
processing logic as "update_blacklists", but it uses the most recent
version of the porn and adult blacklists (the ".0" version). I have
"update_blacklists" set up as a scheduled cron job, and I run
"update_blacklists_nodownload" manually (as a Webmin custom command) if
I've made changes to the _diff.local files and I want them to take
effect immediately.
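For reference, a crontab entry for the scheduled run might look like
this (the time, user, and install path are assumptions; adjust to
taste):

```shell
# /etc/crontab fragment: run the full download/update nightly at
# 02:30 as the squid user (schedule and path are assumptions):
30 2 * * * squid /usr/local/sbin/update_blacklists
```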

These work for me (YMMV). I tried to keep them simple and
straightforward (e.g. no loops) so that almost anyone who wants to
learn can understand them. I'd like to hear about it if you find
errors, or if you make improvements that might help others.

Rick Matthews
