Hi there,

On Tue, 3 Nov 2020, Gary R. Schmidt wrote:

> ... I've written C code that is still in use on everything from
> 80286s to DEC Alphas and Power and SPARC64 and PA-RISC ...

Hehe, I wrote our invoicing, stock control and accounting suite in C
starting around 1986.  Originally it ran under DOS on the Apricot Xi
_and_ it's multi-user _and_ it _still_ runs under plain ordinary DOS,
or FreeDOS, on anything that can run DOS, _and_ it runs under Linux
using exactly the same database.  It all compiles from the same source
using DOSBox and Borland/Zortech C for DOS or gcc natively.  It took a
while to get gcc to OK it, especially the newer (ca. 1990 :) C++ bits.

> ... none of this has *anything* to do with the original problem ...

Er, quite so.

The problem has already been discussed on the list.  I think Micah has
looked a bit harder than I at the byte counting code, and there may
well be issues which I haven't seen if you change the way that blocks
are counted.  That's why I said my suggestion was untested and YMMV.
But it was a very small job to make the changes and I just gave it a
spin to see what happens.  So now it's still not really what I would
call tested, but it does compile, it runs and it produces results.  I
picked a directory that contains a few fairly large files - at least
they're large compared with anything I'd normally scan with my mail
administrator's hat on - on which to exercise it:

8<----------------------------------------------------------------------
$ ls -lR /EXPORTS/log/
/EXPORTS/log/:
total 384696
-rw-r----- 1 root    adm       5293079 Nov  4 13:51 auth.log
-rw-r----- 1 root    adm     191611794 Nov  4 13:51 authpriv.log
drwxr-xr-x 2 _chrony _chrony      4096 Mar 15  2020 chrony
-rw-r----- 1 root    adm      95743836 Nov  4 13:51 cron.log
-rw-r----- 1 root    adm       9658075 Nov  4 12:54 daemon.log
-rw-r----- 1 root    adm       4309854 Nov  2 16:05 kern.log
-rw-r----- 1 root    adm      47876683 Nov  4 13:47 mail.log
-rw-r----- 1 root    adm        673219 Nov  4 13:35 syslog.log
-rw-r----- 1 root    adm      38712514 Nov  4 13:51 user.log

/EXPORTS/log/chrony:
total 53816
-rw-r--r-- 1 _chrony _chrony 19844910 Nov  4 13:51 measurements.log
-rw-r--r-- 1 _chrony _chrony 16554307 Nov  4 13:51 statistics.log
-rw-r--r-- 1 _chrony _chrony 18692664 Nov  4 13:51 tracking.log

$ du -bc /EXPORTS/log/
55095977        /EXPORTS/log/chrony
449011895       /EXPORTS/log
449011895       total
8<----------------------------------------------------------------------
$ clamscan -r --debug --verbose --stdout --statistics=pcre  \
  --detect-pua=yes --alert-exceeds-max=yes --max-scantime=0 \
  --max-filesize=500M --max-scansize=500M --disable-cache   \
  /EXPORTS/log > /home/ged/clamscan_EXPORTS_log 2>&1
...
/EXPORTS/log/daemon.log: YARA.Sanesecurity_Spam_test.UNOFFICIAL FOUND
...
----------- SCAN SUMMARY -----------
Known viruses: 13556894
Engine version: 0.103.0-rc2
Scanned directories: 2
Scanned files: 12
Infected files: 1
Data scanned: 494860005.00 Bytes
Data read: 449076137.00 Bytes (ratio 1.10:1)
Time: 2339.015 sec (38 m 59 s)
Start Date: 2020:11:04 13:58:13
End Date:   2020:11:04 14:37:12
8<----------------------------------------------------------------------

Notes:

Obviously the printf() statements could be tidied up to print integers,
but scanning 200MByte files seems to be no problem with these changes.
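
To illustrate the integer point, here is a minimal sketch.  It is not
the clamscan source: it is written in Python, whose % operator follows
printf()-style conversions for these two specifiers, and the function
names are hypothetical.  It assumes the counter is currently printed
with a floating-point format such as %.2f, which is what the trailing
".00" in the summary suggests.

```python
# Sketch only: mimics the two printf()-style conversions in Python.
# Function names are hypothetical; the value is taken from the
# "Data scanned" line of the scan summary.

def summary_line_float(byte_count: float) -> str:
    """Format a byte count the way a %.2f conversion would."""
    return "Data scanned: %.2f Bytes" % byte_count


def summary_line_int(byte_count: int) -> str:
    """Format the same count with an integer conversion instead."""
    return "Data scanned: %d Bytes" % byte_count


print(summary_line_float(494860005.0))
print(summary_line_int(494860005))
```

The second form drops the meaningless fractional part, and an integer
counter also avoids the loss of exactness that a double suffers above
2^53.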

The YARA.Sanesecurity_Spam_test string is indeed found in daemon.log;
apparently I was testing some YARA rules.

As you can see, the scan read a little more data than du had reported
(449076137 vs. 449011895 bytes, i.e. 64242 more), but that's to be
expected of course as the logs are live.

Some of the files give "Data scanned" values of three or four times
the "Data read" values even though they're plain text.  Thoughts?  I
haven't investigated.
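
For anyone who wants to check the arithmetic behind these notes, a
short sketch in Python follows.  The three constants are copied from
the du output and the scan summary above; the helper names are made up
for the example.  It only reproduces the overall figures, since no
per-file "Data scanned" values are quoted here.

```python
# Figures copied from the du output and the SCAN SUMMARY above.
DU_TOTAL = 449011895       # bytes reported by `du -bc`
DATA_READ = 449076137      # "Data read" in the scan summary
DATA_SCANNED = 494860005   # "Data scanned" in the scan summary


def growth(before: int, after: int) -> int:
    """Bytes by which the second figure exceeds the first."""
    return after - before


def scan_ratio(scanned: int, read: int) -> str:
    """Scanned-to-read ratio, formatted like the summary line."""
    return "%.2f:1" % (scanned / read)


print("extra bytes read vs. du:", growth(DU_TOTAL, DATA_READ))
print("overall ratio:", scan_ratio(DATA_SCANNED, DATA_READ))
```

The overall ratio agrees with the 1.10:1 printed in the summary, so a
per-file ratio of three or four would have to be offset by other files
sitting at or near 1:1.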

--

73,
Ged.

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml