"(don't you love C?)"

I have never understood why the originators of C didn't give integers
explicit widths in bits: their scheme often made C code non-portable.

When I wrote code in the mid 1990s for the DEC Alpha, ints were 32 bits
while longs were 64 (unlike most other C compilers of the day, where
longs were 32 bits). This made Alpha C code hard to port to lesser CPUs.
On the other hand, when I wrote C on DOS for the IBM PC in the late
1980s, ints were only 16 bits! It took some time to figure out why my
standard-conforming code failed so badly. In spite of all that, having
started programming before C was invented, I can safely say that C is
better than its predecessors for software like ClamAV.

P.S. Good code these days tends to use fixed-width typedefs like int32_t
and uint64_t (from C99's <stdint.h>). A shame the original ClamAV coders
didn't do that.
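
To make the width point concrete, here is a minimal sketch assuming a C99
compiler with <stdint.h>. The struct and function names are made up for
illustration and are not ClamAV's. The helper works out how many bytes a
block counter of a given width can represent at a given count precision.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical counters (NOT ClamAV's struct s_info): with C99 fixed-width
 * types the size is part of the type name, on every platform. */
typedef struct {
    uint32_t files;  /* always 32 bits */
    uint64_t blocks; /* always 64 bits */
} example_counters;

/* Width in bits of unsigned long on the platform compiling this code. */
unsigned ulong_bits(void)
{
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Largest byte total representable by a counter that is counter_bits wide
 * and counts blocks of `precision` bytes. Saturates at UINT64_MAX instead
 * of overflowing. */
uint64_t max_countable_bytes(unsigned counter_bits, uint64_t precision)
{
    uint64_t max_blocks;

    assert(counter_bits >= 1 && counter_bits <= 64);
    max_blocks = (counter_bits == 64)
                     ? UINT64_MAX
                     : (UINT64_C(1) << counter_bits) - 1;

    if (precision != 0 && max_blocks > UINT64_MAX / precision)
        return UINT64_MAX; /* product would not fit in 64 bits */
    return max_blocks * precision;
}
```

With a 32-bit counter, max_countable_bytes(32, 1) is 4294967295 (about
4 GiB), while max_countable_bytes(32, 4096) is 17592186040320 (about
16 TiB). Calling max_countable_bytes(ulong_bits(), 4096) shows what the
local platform's unsigned long would allow.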



On Tue, 3 Nov 2020 01:53:33 +0000
"Micah Snyder (micasnyd)" <micas...@cisco.com> wrote:

> I hadn't really looked at the code. You raise a good point.
> 
> Changing it isn't super simple.  The info.blocks variable is passed through 
> cli_scandesc_callback() and scan_common() where it's placed into the scan 
> context.  When data is scanned, the amount scanned is divided by 
> CL_COUNT_PRECISION (also found in clamav.h), which is what you multiply the 
> number by to get the value in bytes. Provided that all downstream 
> applications use CL_COUNT_PRECISION as clamscan does, we could shrink the 
> count precision from 4k to something lower, but that would also decrease the 
> max amount of data which could be scanned.  
> 
> If the variable were a uint64_t, that'd probably be fine... but it's an
> unsigned long int... aka maybe 4 bytes or maybe 8 bytes (don't you love C?).
> On systems where an unsigned long is 4 bytes, a count precision of 1 would
> cap the scan limit at 4GB.  Changing the variable to a uint64_t would be
> "best", but it would be a non-backwards-compatible change to the API, which
> is very much not worth it.
> 
> Sigh :-/
> 
> > -----Original Message-----
> > From: clamav-users <clamav-users-boun...@lists.clamav.net> On Behalf Of
> > Paul Kosinski via clamav-users
> > Sent: Monday, November 2, 2020 5:23 PM
> > To: clamav-users@lists.clamav.net
> > Cc: Paul Kosinski <clamav-us...@iment.com>
> > Subject: Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned
> > 
> > Can this really be done? I was looking at the code referred to by G.W.
> > Haywood, and I see that it uses "info.blocks" and "info.rblocks".
> > Looking at the definitions in "clamav-0.103.0/clamscan/", I see the
> > following:
> > 
> > struct s_info {
> >     unsigned int sigs;         /* number of signatures */
> >     unsigned int dirs;         /* number of scanned directories */
> >     unsigned int files;        /* number of scanned files */
> >     unsigned int ifiles;       /* number of infected files */
> >     unsigned int errors;       /* number of errors */
> >     unsigned long int blocks;  /* number of *scanned* 16kb blocks */
> >     unsigned long int rblocks; /* number of *read* 16kb blocks */
> > };
> > 
> > This suggests that the counts for "scanned" and "read" are not really byte
> > counts, and EICAR's 68 bytes would always be recorded as 0 (if normal
> > rounding rules are applied).
> > 
> > 
> > 
> > On Mon, 2 Nov 2020 23:59:20 +0000
> > "Micah Snyder \(micasnyd\) via clamav-users" <clamav-users@lists.clamav.net>
> > wrote:
> >   
> > > I agree.  We already have some logic in freshclam to convert bytes to
> > > human readable B / KiB / MiB / GiB format.  It should be pretty much a
> > > copypaste effort to improve the data scanned/read output.
> > >
> > > -Micah
> > >
> > > > On 11/2/20, 9:47 AM, "clamav-users on behalf of G.W. Haywood via
> > > > clamav-users" <clamav-users-boun...@lists.clamav.net on behalf of
> > > > clamav-us...@lists.clamav.net> wrote:
> > >
> > >     Hi there,
> > >
> > >     On Mon, 2 Nov 2020, Paul Kosinski via clamav-users wrote:
> > >  
> > >     > ... I still think it is a bad message that should be fixed.  
> > >
> > >     +1
> > >
> > >     If you want to try a very quick and dirty tweak to get more precise
> > >     numbers, change the value of
> > >
> > >     1) CL_COUNT_PRECISION in .../libclamav/clamav.h from 4096 to 1
> > >
> > >     2) replace '1024' with '1' in four places in clamscan/clamscan.c
> > >
> > >     3) change 'MB' to 'Bytes' in two places in clamscan/clamscan.c and
> > >
> > >     4) rebuild.
> > >
> > >     
> > > 8<----------------------------------------------------------------------
> > > >     ~/clamav-0.103.0-rc2: $ grep -C3 -r CL_COUNT_PRECISION clamscan libclamav | ...
> > > >     ...
> > > >     ...
> > > >     clamscan/clamscan.c:        mb = info.blocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
> > > >     clamscan/clamscan.c:        logg("Data scanned: %2.2lf MB\n", mb);
> > > >     clamscan/clamscan.c:        rmb = info.rblocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
> > > >     clamscan/clamscan.c:        logg("Data read: %2.2lf MB (ratio %.2f:1)\n", rmb, info.rblocks ? (double)info.blocks / (double)info.rblocks : 0);
> > >     ...
> > >     ...
> > >     libclamav/clamav.h:#define CL_COUNT_PRECISION 4096
> > >     ...
> > >     ...
> > >
> > > 8<--------------------------------------------------------------------
> > > --
> > >
> > >     This is untested, YMMV.  Obviously, if you're skilled in the art, this
> > >     can be done better.  Note that 'MB' should in any case be 'MiB' as the
> > >     values printed are the counts divided by 2^20 and not by 10^6.
> > >
> > >     --
> > >
> > >     73,
> > >     Ged.  

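The counting scheme discussed in the quoted thread can be sketched as
follows. This is an illustration only: COUNT_PRECISION mirrors the 4096
shown for CL_COUNT_PRECISION in the grep output, but bytes_to_blocks() and
format_bytes() are hypothetical helpers, not functions from libclamav or
freshclam, and plain integer division is an assumption about how the block
count is accumulated.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define COUNT_PRECISION 4096u /* same value as CL_COUNT_PRECISION above */

/* Whole blocks accounted for nbytes. Integer division truncates, so any
 * input smaller than one block contributes 0 blocks. */
uint64_t bytes_to_blocks(uint64_t nbytes)
{
    return nbytes / COUNT_PRECISION;
}

/* Format a byte count as B / KiB / MiB / GiB (powers of 1024) into buf
 * and return buf. Values of 1024 GiB or more are still shown in GiB. */
const char *format_bytes(uint64_t nbytes, char *buf, size_t buflen)
{
    static const char *const units[] = { "B", "KiB", "MiB", "GiB" };
    double value = (double)nbytes;
    size_t u = 0;

    assert(buf != NULL && buflen > 0);

    while (value >= 1024.0 && u + 1 < sizeof(units) / sizeof(units[0])) {
        value /= 1024.0;
        u++;
    }

    if (u == 0)
        snprintf(buf, buflen, "%llu B", (unsigned long long)nbytes);
    else
        snprintf(buf, buflen, "%.2f %s", value, units[u]);
    return buf;
}
```

Under these assumptions bytes_to_blocks(68) is 0, which is why a 68-byte
EICAR file would not show up in a block-based total, whereas
format_bytes(68, buf, sizeof buf) yields "68 B" when given the raw byte
count.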
