Dennis Peterson wrote [reordered]:

>You didn't say what your iowait rate was during your scan (from top, for 
>example). If you have multiple disks/arrays you can also fire off 
>multiple scanning sessions as I doubt you're pegging the CPUs. This 
>doesn't work well if you're on a set of mirrored disks.

Sorry, as I implied in my original post, I *am* pegging the CPU, and the
iowait is minimal (0.3% wa in the report below).
Here is a typical top report:

  Tasks:  74 total,   3 running,  71 sleeping,   0 stopped,   0 zombie
  Cpu(s):  1.0% us,  0.3% sy, 98.3% ni,  0.0% id,  0.3% wa,  0.0% hi,  0.0% si
  Mem:   1036320k total,  1020416k used,    15904k free,     2148k buffers
  Swap:  3164796k total,     3176k used,  3161620k free,   930472k cached

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  26185 samba     35  10 17040  15m  988 R 98.2  1.5   2033:10 clamscan

(This is a SOHO setup using a simple Linux MD software mirror with two 320 GB, 
7200 rpm, ATA/100 disks.)


>Don't scan every file every day - that makes no sense. Just scan files 
>that have changed since the previous scan (google tripwire and similar 
>tools).
>
>FWIW I differentially scan a multi-terabyte (50 TB to be exact) file 
>system in just over a couple of hours with ClamAV. The first time I scanned 
>it required a couple of weeks.
>
>BTW, I had a scanner that didn't do a full file scan and I got rid of it 
>too - good choice.

If you use a Tripwire-like technique, I presume you keep a list of SHA-1 (or
better) hashes for all the files, since merely recording the files' timestamps
and sizes is too easy to fool. (MD5 is considered rather weak now, and even
SHA-1 is under attack.)
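
To make the hash-list idea concrete, here is a minimal sketch in Python. It is
not how Tripwire itself stores or compares its database; SHA-256 is my own
choice of hash, and the function names (sha256_of, build_manifest,
changed_files) are made up for illustration. Note that every file is read in
full each time a manifest is built.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map every regular file under root to its digest (reads every byte)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                manifest[path] = sha256_of(path)
    return manifest


def changed_files(old, new):
    """Return paths that are new, or whose digest differs from the old one."""
    return sorted(p for p, digest in new.items() if old.get(p) != digest)
```

The previous manifest would be saved somewhere the scanned machine cannot
tamper with it, and changed_files(previous, current) gives the list of files
worth rescanning.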

Since computing hashes (no matter what kind) almost certainly requires reading
each byte of each and every file, and you have about 50,000 GB of files, if you
can do the differential scan in say 10,000 seconds (2.78 hours), you must be
computing hashes at about 5 GB per second. Wow! Fortunately, with my 320 GB, 
I would only have to compute hashes at about 32 MB/s. But that's still running 
my disks flat out, and probably beyond my CPU.
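
The arithmetic above can be checked with a few lines of Python. The
10,000-second window is my round-number assumption for "just over a couple
hours", and GB and MB are taken as decimal units (1 GB = 1000 MB).

```python
def required_rate_mb_per_s(total_gb, seconds):
    """Throughput in MB/s needed to read total_gb (decimal GB) in the given time."""
    return total_gb * 1000 / seconds


SCAN_SECONDS = 10_000  # assumed scan window, about 2.78 hours

print(required_rate_mb_per_s(50_000, SCAN_SECONDS))  # 50 TB file system: 5000.0
print(required_rate_mb_per_s(320, SCAN_SECONDS))     # 320 GB of disk: 32.0
```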

I don't remember any way for Tripwire to call an arbitrary program when it
detects a file difference, but I suppose you can just feed the report of changes
to some script which then invokes ClamAV.
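
Such a script might look roughly like the sketch below. It assumes the change
report has already been reduced to one absolute path per line, which is not
Tripwire's native report format, and the function names and batch size are
made up. The only ClamAV features it relies on are clamscan's --no-summary and
--infected options and its exit codes (0 = clean, 1 = virus found, 2 = error).

```python
import subprocess


def build_commands(paths, batch_size=100):
    """Group changed paths into clamscan command lines of at most batch_size files."""
    cleaned = [p.strip() for p in paths if p.strip()]
    return [
        ["clamscan", "--no-summary", "--infected"] + cleaned[i:i + batch_size]
        for i in range(0, len(cleaned), batch_size)
    ]


def scan(paths, batch_size=100):
    """Run clamscan on each batch and return the worst exit code seen."""
    worst = 0
    for cmd in build_commands(paths, batch_size):
        worst = max(worst, subprocess.run(cmd).returncode)
    return worst
```

Calling scan(sys.stdin) after importing sys would let the list of changed
files be piped in from whatever produces it.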

I'll have to think about this, as it's becoming a lot more complicated than I
had expected.

PK

_______________________________________________
http://lurker.clamav.net/list/clamav-users.html
