Hi there,

On Tue, 3 Nov 2020, Gary R. Schmidt wrote:

> ... I've written C code that is still in use on everything from 80286s to DEC Alphas and Power and SPARC64 and PA-RISC ...

Hehe, I wrote our invoicing, stock control and accounting suite in C starting around 1986. Originally it ran under DOS on the Apricot Xi _and_ it's multi-user _and_ it _still_ runs under plain ordinary DOS, or FreeDOS, on anything that can run DOS, _and_ it runs under Linux using exactly the same database. It all compiles from the same source, using DOSBox and Borland/Zortech C for DOS, or gcc natively. It took a while to get gcc to OK it, especially the newer (ca. 1990 :) C++ bits.

> ... none of this has *anything* to do with the original problem ...

Er, quite so. The problem has already been discussed on the list.

I think Micah has looked a bit harder than I at the byte counting code, and there may well be issues which I haven't seen if you change the way that blocks are counted. That's why I said my suggestion was untested and YMMV. But it was a very small job to make the changes and I just gave it a spin to see what happens. So now it's still not really what I would call tested, but it does compile, it runs and it produces results.

I picked a directory that contains a few fairly large files - at least they're large compared with anything I'd normally scan with my mail administrator's hat on - on which to exercise it:

8<----------------------------------------------------------------------
$ ls -lR /EXPORTS/log/
/EXPORTS/log/:
total 384696
-rw-r----- 1 root    adm       5293079 Nov  4 13:51 auth.log
-rw-r----- 1 root    adm     191611794 Nov  4 13:51 authpriv.log
drwxr-xr-x 2 _chrony _chrony      4096 Mar 15  2020 chrony
-rw-r----- 1 root    adm      95743836 Nov  4 13:51 cron.log
-rw-r----- 1 root    adm       9658075 Nov  4 12:54 daemon.log
-rw-r----- 1 root    adm       4309854 Nov  2 16:05 kern.log
-rw-r----- 1 root    adm      47876683 Nov  4 13:47 mail.log
-rw-r----- 1 root    adm        673219 Nov  4 13:35 syslog.log
-rw-r----- 1 root    adm      38712514 Nov  4 13:51 user.log

/EXPORTS/log/chrony:
total 53816
-rw-r--r-- 1 _chrony _chrony 19844910 Nov  4 13:51 measurements.log
-rw-r--r-- 1 _chrony _chrony 16554307 Nov  4 13:51 statistics.log
-rw-r--r-- 1 _chrony _chrony 18692664 Nov  4 13:51 tracking.log
$ du -bc /EXPORTS/log/
55095977        /EXPORTS/log/chrony
449011895       /EXPORTS/log
449011895       total
8<----------------------------------------------------------------------
$ clamscan -r --debug --verbose --stdout --statistics=pcre \
    --detect-pua=yes --alert-exceeds-max=yes --max-scantime=0 \
    --max-filesize=500M --max-scansize=500M --disable-cache \
    /EXPORTS/log > /home/ged/clamscan_EXPORTS_log 2>&1
...
/EXPORTS/log/daemon.log: YARA.Sanesecurity_Spam_test.UNOFFICIAL FOUND
...
----------- SCAN SUMMARY -----------
Known viruses: 13556894
Engine version: 0.103.0-rc2
Scanned directories: 2
Scanned files: 12
Infected files: 1
Data scanned: 494860005.00 Bytes
Data read: 449076137.00 Bytes (ratio 1.10:1)
Time: 2339.015 sec (38 m 59 s)
Start Date: 2020:11:04 13:58:13
End Date:   2020:11:04 14:37:12
8<----------------------------------------------------------------------

Notes:

Obviously the printf() statements could be tidied up to print integers, but scanning 200MByte files seems to be no problem with these changes.

The YARA.Sanesecurity_Spam_test string is indeed found in daemon.log; apparently I was testing some YARA rules.

As you can see there was a little more data in the logfiles after the scan, but that's to be expected of course as the logs are live.

Some of the files give "Data scanned" values of three or four times the "Data read" values even though they're plain text. Thoughts? I haven't investigated.

-- 

73,
Ged.
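
P.S. A minimal sketch of the "print integers" tidy-up mentioned in the notes, assuming the byte counters are held as 64-bit unsigned integers. This is not the ClamAV source: format_bytes_line() and scan_ratio() are hypothetical names used only for illustration, and the real summary code may be organised quite differently.

```c
#include <assert.h>
#include <inttypes.h>   /* PRIu64 */
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not a ClamAV function): format one summary line
 * from a 64-bit byte counter, so the value prints as an integer with no
 * trailing ".00". Returns what snprintf() returns. */
static int format_bytes_line(char *buf, size_t len,
                             const char *label, uint64_t bytes)
{
    assert(buf != NULL && label != NULL);
    return snprintf(buf, len, "%s: %" PRIu64 " Bytes", label, bytes);
}

/* Hypothetical helper: ratio of bytes scanned to bytes read.
 * Returns 0.0 when nothing was read, to avoid dividing by zero. */
static double scan_ratio(uint64_t scanned, uint64_t read_bytes)
{
    if (read_bytes == 0)
        return 0.0;
    return (double)scanned / (double)read_bytes;
}
```

With the figures from the summary above, format_bytes_line(buf, sizeof buf, "Data scanned", 494860005) writes "Data scanned: 494860005 Bytes" into buf, and scan_ratio(494860005, 449076137) comes out at roughly 1.10, which agrees with the ratio clamscan reported. PRIu64 is used rather than %lu or %llu so the same format string works on both 32-bit and 64-bit builds.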