Chris wrote:
On Sunday 04 March 2007 4:15 pm, Dennis Peterson wrote:
Steve, since I'm using a script that was posted here quite some time ago
what changes need to be made:
Create a text file, msrbl.list, with these two lines:
MSRBL-SPAM.ndb
MSRBL-Images.hdb
Run rsync and call that file, and the rsync URI from above:
rsync -aq --files-from=/path/to/msrbl.list \
rsync://rsync.mirror.msrbl.com/msrbl/ /path/to/pattern-files
This will download both the spam and image files in one invocation of
rsync, and put them in the directory pointed to by /path/to/pattern-files.
Run validation checks as before to be sure they won't break clamd. The
msrbl.list is to allow the thing to scale in the event msrbl adds more
lists to their fine services.
dp
Hopefully I've done this right: a new MSRBL-Images.hdb and MSRBL-SPAM.ndb were
downloaded to /var/tmp/clamav, appeared to be tested, and were moved
to /var/lib/clamav. Here is what I now have in the script:
rsync -aq --files-from=/usr/local/bin/msrbl.list \
rsync://rsync.mirror.msrbl.com/msrbl/ /var/tmp/clamdb
test -s $tmp_dir/MSRBL-SPAM.ndb && \
clamscan --quiet -d $tmp_dir/MSRBL-SPAM.ndb && \
cp --reply=yes MSRBL-SPAM.ndb MSRBL-SPAM.ndb-bak && \
mv -f $tmp_dir/MSRBL-SPAM.ndb .
test -s $tmp_dir/MSRBL-Images.hdb && \
clamscan --quiet -d $tmp_dir/MSRBL-Images.hdb && \
cp --reply=yes MSRBL-Images.hdb MSRBL-Images.hdb-bak && \
mv -f $tmp_dir/MSRBL-Images.hdb .
I ran it twice, and both times it downloaded a new .hdb and .ndb file; at least,
the 'modified' times were within a couple of minutes of the current time.
I've commented out the
I just now realized you're moving the downloaded file to the ClamAV
working directory rather than copying it. By doing this you defeat one
of the truly great things about rsync - intelligent copies. For small
files this isn't a big deal, but for very large files rsync, with no
local copy left to compare against, has to download the entire thing even
though it may have changed only in the last few lines. I'll give you an
example - stop me if you've heard this...
A web server log can grow by several megs each day. At the end of each
day you'd like to have a copy of that log, now nearly 20g in size, sent
to your activity reporter. Rather than copying that 20g+ and growing
file, you use rsync. Rsync will look at the remote file and compare it
with your local file and send only the differences - that would be the
changes made that day.
In order to do this it has to have a previous copy to compare against
which is why moving your file as you do negates this feature.
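One way to keep that previous copy around is to leave the staging directory alone and copy, rather than move, each validated file into the ClamAV directory. What follows is only a rough sketch under assumptions: the function name install_pattern and the VALIDATE override are made up for illustration, the paths in the usage comment are placeholders, and the validation command is simply the clamscan call from the script above.

```shell
#!/bin/sh
# Sketch only: validate a staged pattern file and COPY it into the ClamAV
# directory, leaving the staged file in place for rsync's next comparison.
#   $1 = staging directory (rsync target)   $2 = ClamAV directory   $3 = file name
# VALIDATE is a hypothetical override hook; the default is the clamscan check
# used in the script above.
install_pattern() {
    staged="$1/$3"
    live="$2/$3"

    # Skip a missing or empty download.
    test -s "$staged" || return 1

    # Nothing to do if the live copy is already identical.
    if [ -f "$live" ] && cmp -s "$staged" "$live"; then
        return 0
    fi

    # Make sure the new file loads before it goes anywhere near clamd.
    ${VALIDATE:-clamscan --quiet -d} "$staged" || return 1

    # Back up the current live file, then copy the staged one into place.
    if [ -f "$live" ]; then
        cp -f "$live" "$live-bak" || return 1
    fi
    cp -f "$staged" "$live.new" && mv -f "$live.new" "$live"
}

# Example wiring (all paths are placeholders):
#   rsync -aq --files-from=/path/to/msrbl.list \
#       rsync://rsync.mirror.msrbl.com/msrbl/ /path/to/pattern-files
#   install_pattern /path/to/pattern-files /var/lib/clamav MSRBL-SPAM.ndb
#   install_pattern /path/to/pattern-files /var/lib/clamav MSRBL-Images.hdb
```

The copy goes to a temporary name first and is then renamed, so clamd never sees a half-written file; the staged copy stays where rsync left it.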
The other thing rsync does is some quick math to see if the source file
has changed from the local copy, and if there have been no changes the
process stops. Very efficient.
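By default that quick math is a comparison of file size and modification time. The function below is a rough local imitation of the idea, not rsync's own code; the name quick_check_same is made up for illustration, and it relies on the -nt test operator, which bash, dash and ksh provide but POSIX does not require.

```shell
#!/bin/sh
# Rough imitation of rsync's default "quick check" for two local files.
# Returns 0 (treat as unchanged) when size and modification time both match.
quick_check_same() {
    [ -f "$1" ] && [ -f "$2" ] || return 1

    # Compare sizes in bytes.
    size_a=$(wc -c < "$1" | tr -d ' ')
    size_b=$(wc -c < "$2" | tr -d ' ')
    [ "$size_a" -eq "$size_b" ] || return 1

    # Compare modification times: neither file may be newer than the other.
    [ "$1" -nt "$2" ] && return 1
    [ "$2" -nt "$1" ] && return 1
    return 0
}

# Example (placeholder paths):
#   quick_check_same /path/to/pattern-files/MSRBL-SPAM.ndb \
#       /var/lib/clamav/MSRBL-SPAM.ndb && echo "unchanged"
```

This also shows why the -a option matters in the rsync command: it preserves modification times, and without matching times the quick check cannot treat a file as unchanged.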
The rsync process allows you to optimize for cpu usage or bandwidth when
the source file has changed. It takes some cpu power to make the file
comparisons but very little bandwidth. If you optimize for cpu usage
then bandwidth suffers as you have to transfer entire files. Rsync does
a poor job with text files that are compressed at each update: the
entire zip file is different even when small changes are made, so it ends
up having to transfer nearly all of the file each time even though the
uncompressed text may have changed very little.
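The compression effect is easy to see for yourself. This is a small demonstration under assumptions: the file contents are made up, and counting differing byte positions with cmp is only a crude stand-in for what rsync actually compares.

```shell
#!/bin/sh
# Count the byte positions at which two files differ (0 if identical).
count_diff_bytes() {
    cmp -l "$1" "$2" 2>/dev/null | wc -l | tr -d ' '
}

work=$(mktemp -d)

# Two text files that differ in a single byte on the first line.
seq 1 2000 > "$work/old.txt"
sed '1s/.*/X/' "$work/old.txt" > "$work/new.txt"

# Compress each one; -n keeps the name and timestamp out of the gzip header.
gzip -nc "$work/old.txt" > "$work/old.gz"
gzip -nc "$work/new.txt" > "$work/new.gz"

raw_diff=$(count_diff_bytes "$work/old.txt" "$work/new.txt")
gz_diff=$(count_diff_bytes "$work/old.gz" "$work/new.gz")

echo "differing bytes, plain text: $raw_diff"
echo "differing bytes, gzipped:    $gz_diff"
```

A one-byte change in the text ripples through the compressed stream, so rsync finds little in the old archive that it can reuse.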
I have a remote and very isolated server. It's a 1U Sun sparc with no
tape drive, no expansion slots, no nothing, and I need to back it up to
my NAS across the state. I use rsync to copy only the changed parts of
existing files, or entire files if they are new, to my NAS where they
are then put on tape. That's a very busy little server with dozens of
web servers, mail lists, user accounts, etc., and it takes very little
time and bandwidth to refresh the local data thanks to rsync.
My guess is the MSRBL folks would like it if you downloaded the new
files only if the file has been modified.