Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-04 Thread Paul Kosinski via clamav-users
Sorry, forgot to CC the list (sometimes Reply All *is* appropriate).



I think it's important that ClamAV *not* say there were 0.00 MB
scanned/read, while also saying a file *was* scanned & infected.
Contradictions like this [might be interpreted to] suggest a serious
problem in the logic.

Rounding 68 bytes up to 0.02 MB would also suggest a bug of some sort.
Perhaps it would be best to be explicit and simply report the number of
16 KiB blocks read and scanned, rather than megabytes read and scanned.
Maybe there is a better word than "blocks" (which suggests disk blocks),
such as "segments", "sequences" or "pieces"?


On Wed, 4 Nov 2020 17:49:09 +
"Micah Snyder (micasnyd)"  wrote:

> Do you reckon folks will be less confused if it rounds up?
> 
> -Micah

___

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml


Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-04 Thread G.W. Haywood via clamav-users

Hi there,

On Tue, 3 Nov 2020, Gary R. Schmidt wrote:

> ... I've written C code that is still in use on everything from
> 80286s to DEC Alphas and Power and SPARC64 and PA-RISC ...


Hehe, I wrote our invoicing, stock control and accounting suite in C
starting around 1986.  Originally it ran under DOS on the Apricot Xi
_and_ it's multi-user _and_ it _still_ runs under plain ordinary DOS,
or FreeDOS, on anything that can run DOS, _and_ it runs under Linux
using exactly the same database.  It all compiles from the same source
using DosBOX and Borland/Zortech C for DOS or gcc natively.  It took a
while to get gcc to OK it, especially the newer (ca. 1990 :) C++ bits.


> ... none of this has *anything* to do with the original problem ...


Er, quite so.

The problem has already been discussed on the list.  I think Micah has
looked a bit harder than I at the byte counting code, and there may
well be issues which I haven't seen if you change the way that blocks
are counted.  That's why I said my suggestion was untested and YMMV.
But it was a very small job to make the changes and I just gave it a
spin to see what happens.  So now it's still not really what I would
call tested, but it does compile, it runs and it produces results.  I
picked a directory that contains a few fairly large files - at least
they're large compared with anything I'd normally scan with my mail
administrator's hat on - on which to exercise it:

8<--
$ ls -lR /EXPORTS/log/
/EXPORTS/log/:
total 384696
-rw-r- 1 root    adm       5293079 Nov  4 13:51 auth.log
-rw-r- 1 root    adm     191611794 Nov  4 13:51 authpriv.log
drwxr-xr-x 2 _chrony _chrony      4096 Mar 15  2020 chrony
-rw-r- 1 root    adm      95743836 Nov  4 13:51 cron.log
-rw-r- 1 root    adm       9658075 Nov  4 12:54 daemon.log
-rw-r- 1 root    adm       4309854 Nov  2 16:05 kern.log
-rw-r- 1 root    adm      47876683 Nov  4 13:47 mail.log
-rw-r- 1 root    adm        673219 Nov  4 13:35 syslog.log
-rw-r- 1 root    adm      38712514 Nov  4 13:51 user.log

/EXPORTS/log/chrony:
total 53816
-rw-r--r-- 1 _chrony _chrony 19844910 Nov  4 13:51 measurements.log
-rw-r--r-- 1 _chrony _chrony 16554307 Nov  4 13:51 statistics.log
-rw-r--r-- 1 _chrony _chrony 18692664 Nov  4 13:51 tracking.log

$ du -bc /EXPORTS/log/
55095977    /EXPORTS/log/chrony
449011895   /EXPORTS/log
449011895   total
8<--
$ clamscan -r --debug --verbose --stdout --statistics=pcre  \
  --detect-pua=yes --alert-exceeds-max=yes --max-scantime=0 \
  --max-filesize=500M --max-scansize=500M --disable-cache   \
  /EXPORTS/log > /home/ged/clamscan_EXPORTS_log 2>&1
...
/EXPORTS/log/daemon.log: YARA.Sanesecurity_Spam_test.UNOFFICIAL FOUND
...
--- SCAN SUMMARY ---
Known viruses: 13556894
Engine version: 0.103.0-rc2
Scanned directories: 2
Scanned files: 12
Infected files: 1
Data scanned: 494860005.00 Bytes
Data read: 449076137.00 Bytes (ratio 1.10:1)
Time: 2339.015 sec (38 m 59 s)
Start Date: 2020:11:04 13:58:13
End Date:   2020:11:04 14:37:12
8<--

Notes:

Obviously the printf() statements could be tidied up to print integers,
but scanning 200MByte files seems to be no problem with these changes.
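
The tidy-up mentioned above might look something like the sketch below.
It assumes the patched build keeps its totals in 64-bit byte counters.
The function name format_summary and its parameters are hypothetical and
are not taken from the ClamAV source.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not ClamAV code: writes the two summary lines
 * with the byte totals printed as integers rather than via "%.2f".
 * Returns whatever snprintf() returns. */
int format_summary(char *buf, size_t len,
                   uint64_t scanned_bytes, uint64_t read_bytes)
{
    assert(buf != NULL && len > 0);

    /* Avoid dividing by zero when nothing was read. */
    double ratio = read_bytes
                       ? (double)scanned_bytes / (double)read_bytes
                       : 0.0;

    return snprintf(buf, len,
                    "Data scanned: %" PRIu64 " Bytes\n"
                    "Data read: %" PRIu64 " Bytes (ratio %.2f:1)\n",
                    scanned_bytes, read_bytes, ratio);
}
```

Called with the totals from the summary above (494860005 and 449076137),
this should give the same two lines without the trailing ".00".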

The YARA.Sanesecurity_Spam_test string is indeed found in daemon.log,
apparently I was testing some Yara rules.

As you can see there was a little more data in the logfiles after the
scan but that's to be expected of course as the logs are live.

Some of the files give "Data scanned" values of three or four times
the "Data read" values even though they're plain text.  Thoughts?  I
haven't investigated.

--

73,
Ged.



Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-04 Thread Mark Fortescue via clamav-users

Yes.

On 04/11/2020 17:49, Micah Snyder (micasnyd) via clamav-users wrote:

> Do you reckon folks will be less confused if it rounds up?
>
> -Micah



Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-04 Thread Micah Snyder (micasnyd) via clamav-users
Do you reckon folks will be less confused if it rounds up?

-Micah

On 11/3/20, 1:37 PM, "clamav-users on behalf of Paul Kosinski via 
clamav-users"  wrote:

If ClamAV always rounded up when counting the number of 16kb blocks,
then it should be counting at least 0.016384 MB (or 0.015625 MiB) for
tiny files. By normal rounding rules this should display as 0.02 MB/MiB.




Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-04 Thread Micah Snyder (micasnyd) via clamav-users
It’s an unfortunate coincidence that you picked an MP3. ClamAV explicitly
ignores MP3 files and a handful of other types.  I don’t have the backstory as
to why, though I’m certain it’s for a good reason.

-Micah

From: clamav-users  on behalf of Ankur 
Sharma via clamav-users 
Reply-To: ClamAV users ML 
Date: Tuesday, November 3, 2020 at 12:41 PM
To: ClamAV users ML 
Cc: Ankur Sharma 
Subject: Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

Thanks a lot for all your inputs.

Further during the test I tried to scan a ".mp3" file of size 5.04MB. The 
returned log mentions the number of Scanned files as 1, but when I see the Data 
Scanned - it is 0.00 MB. Does ClamAV support scanning MP3 files? Has anyone 
tried scanning MP3 files?

{'/tmp/bucket-file-upload/2copy.mp3': 'OK', 'Known viruses': '8931107', 'Engine 
version': '0.102.4', 'Scanned directories': '0', 'Scanned files': '1', 
'Infected files': '0', 'Data scanned': '0.00 MB', 'Data read': '5.04 MB (ratio 
0.00:1)', 'Time': '22.727 sec (0 m 22 s)'}

I am using "clamscan" inside a Lambda function to scan the file which is 
uploaded to an AWS S3 bucket.

regards
Ankur



Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-03 Thread Paul Kosinski via clamav-users
If ClamAV always rounded up when counting the number of 16kb blocks,
then it should be counting at least 0.016384 MB (or 0.015625 MiB) for
tiny files. By normal rounding rules this should display as 0.02 MB/MiB.
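
The arithmetic can be checked with a short sketch. BLOCK_SIZE is taken
from the 16384-byte figure implied above. The function names are made up
for illustration and are not ClamAV's.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE 16384u /* bytes per block, as discussed in this thread */

/* Megabytes (10^6 bytes) represented by a number of whole blocks. */
double blocks_to_mb(uint64_t blocks)
{
    return (double)blocks * BLOCK_SIZE / 1000000.0;
}

/* Mebibytes (2^20 bytes) represented by a number of whole blocks. */
double blocks_to_mib(uint64_t blocks)
{
    return (double)blocks * BLOCK_SIZE / 1048576.0;
}

/* Formats a value with two decimal places, like the scan summary. */
int format_two_places(char *buf, size_t len, double value)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%.2f", value);
}
```

One block is 0.016384 MB or 0.015625 MiB, and "%.2f" shows either as 0.02.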


On Tue, 3 Nov 2020 17:50:18 +
Mark Fortescue via clamav-users  wrote:

> Hi all,
> 
> I would call this a bug. Scanning 1 byte is the same as scanning 1 block.
> 
> When storing things in blocks it is always important to round up or you
> get a false impression of reality.
> 
> You can't store 100 bytes in 0 disk sectors of 128 bytes. It is always 1 
> disk sector.
> 
> Can you not just round up by adding (BlockSize - 1) bytes when setting 
> the block variables ?
> 
> Regards
>   Mark.
> 


Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-03 Thread Ankur Sharma via clamav-users
Thanks a lot for all your inputs.

During further testing I tried to scan an ".mp3" file of size 5.04 MB. The
returned log reports 'Scanned files' as 1, but 'Data scanned' as 0.00 MB.
Does ClamAV support scanning MP3 files? Has anyone tried scanning MP3 files?

{'/tmp/bucket-file-upload/2copy.mp3': 'OK', 'Known viruses': '8931107',
'Engine version': '0.102.4', 'Scanned directories': '0', 'Scanned files':
'1', 'Infected files': '0', 'Data scanned': '0.00 MB', 'Data read': '5.04
MB (ratio 0.00:1)', 'Time': '22.727 sec (0 m 22 s)'}

I am using "clamscan" inside a Lambda function to scan the file which is
uploaded to an AWS S3 bucket.

regards
Ankur

On Wed, Nov 4, 2020 at 4:51 AM Mark Fortescue via clamav-users <
clamav-users@lists.clamav.net> wrote:

> Hi all,
>
> I would call this a bug. Scanning 1 byte is the same as scanning 1 block.
>
> When storing things in blocks it is always important to round up or you
> get a false impression of reality.
>
> You can't store 100 bytes in 0 disk sectors of 128 bytes. It is always 1
> disk sector.
>
> Can you not just round up by adding (BlockSize - 1) bytes when setting
> the block variables ?
>
> Regards
> Mark.
>


-- 
regards
Ankur
+61481141085



Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-03 Thread Mark Fortescue via clamav-users

Hi all,

I would call this a bug. Scanning 1 byte is the same as scanning 1 block.

When storing things in blocks it is always important to round up or you
get a false impression of reality.


You can't store 100 bytes in 0 disk sectors of 128 bytes. It is always 1 
disk sector.


Can you not just round up by adding (BlockSize - 1) bytes when setting 
the block variables ?
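
As a minimal sketch of that idea (the function name and signature are
invented for illustration, and how ClamAV actually sets info.blocks is not
shown in this thread):

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks needed to hold `bytes` bytes, with any partial
 * block counted as a whole one. Hypothetical helper, not ClamAV code. */
uint64_t blocks_rounded_up(uint64_t bytes, uint64_t block_size)
{
    assert(block_size > 0);
    /* The addition below must not wrap around. */
    assert(bytes <= UINT64_MAX - (block_size - 1));

    /* Adding (block_size - 1) before the integer division turns the
     * usual round-down into a round-up. */
    return (bytes + block_size - 1) / block_size;
}
```

With 128-byte sectors, 100 bytes comes out as 1 sector. With 16384-byte
blocks, a 1-byte file counts as 1 block and an empty file as 0.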


Regards
Mark.

On 03/11/2020 16:07, Paul Kosinski via clamav-users wrote:

> "This is a display problem, not a storage problem."
>
> I disagree. When the counts in info.blocks and info.rblocks are counts
> of 16kb *blocks*, keeping precise track of the reading and scanning of
> small files is impossible, no matter how clever the display code is.





Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-03 Thread Paul Kosinski via clamav-users
"This is a display problem, not a storage problem."

I disagree. When the counts in info.blocks and info.rblocks are counts
of 16kb *blocks*, keeping precise track of the reading and scanning of
small files is impossible, no matter how clever the display code is.



On Tue, 3 Nov 2020 17:44:18 +1100
"Gary R. Schmidt"  wrote:

> On 03/11/2020 16:00, Paul Kosinski via clamav-users wrote:
> > "(don't you love C?)"
> > 
> > I have never understood why the originators of C didn't give integers
> > explicit widths in bits: their scheme made C code often non-portable.
> >   
> Because C is intended to be very, very close to the machine 
> architecture, only a step or two above assembler, or doing the 
> bit-twiddling by hand.
> 
> > When I wrote code in the mid 1990s for the DEC Alpha, ints were 32 bits
> > while longs were 64 (unlike "standard" C). This made Alpha C code not
> > portable to lesser CPUs. On the other hand, when I wrote C on DOS for
> > the IBM PC in the late 1980s, ints were only 8 bits! It took some time
> > to figure out why my C-compliant code failed so badly. In spite of all
> > that, having started programming before C was invented, I can safely
> > say that C is better than its predecessors for software like ClamAV.
> >   
> Uh, not a good example, I've written C code that is still in use on 
> everything from 80286s (yes, Virginia, there are people who keep them 
> alive, not just because they're cheap, sometimes just because they 
> *can*) to DEC Alphas and Power and SPARC64 and PA-RISC, it's just a 
> matter of knowing what you are doing, and sticking to it...
> 
> > P.S. Good code these days tends to use typedefs defining things like
> > int32, uint64 etc. A shame the original ClamAV coders didn't do that.
> >   
> And none of this has *anything* to do with the original problem - seeing 
> 0 when the value is 0.01, or so.
> 
> This is a display problem, not a storage problem.  You could declare 
> something as PIC(999.99) and you will still only see 0 
> if you told it to display two decimal places.
> 
>   Cheers,
>   GaryB-)




Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-02 Thread Gary R. Schmidt

On 03/11/2020 16:00, Paul Kosinski via clamav-users wrote:

> "(don't you love C?)"
>
> I have never understood why the originators of C didn't give integers
> explicit widths in bits: their scheme made C code often non-portable.

Because C is intended to be very, very close to the machine
architecture, only a step or two above assembler, or doing the
bit-twiddling by hand.

> When I wrote code in the mid 1990s for the DEC Alpha, ints were 32 bits
> while longs were 64 (unlike "standard" C). This made Alpha C code not
> portable to lesser CPUs. On the other hand, when I wrote C on DOS for
> the IBM PC in the late 1980s, ints were only 8 bits! It took some time
> to figure out why my C-compliant code failed so badly. In spite of all
> that, having started programming before C was invented, I can safely
> say that C is better than its predecessors for software like ClamAV.

Uh, not a good example. I've written C code that is still in use on
everything from 80286s (yes, Virginia, there are people who keep them
alive, not just because they're cheap, sometimes just because they
*can*) to DEC Alphas and Power and SPARC64 and PA-RISC. It's just a
matter of knowing what you are doing, and sticking to it...

> P.S. Good code these days tends to use typedefs defining things like
> int32, uint64 etc. A shame the original ClamAV coders didn't do that.

And none of this has *anything* to do with the original problem - seeing
0 when the value is 0.01, or so.

This is a display problem, not a storage problem.  You could declare
something as PIC(999.99) and you will still only see 0
if you told it to display two decimal places.


Cheers,
GaryB-)



Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-02 Thread Paul Kosinski via clamav-users
"(don't you love C?)"

I have never understood why the originators of C didn't give integers
explicit widths in bits: their scheme made C code often non-portable.

When I wrote code in the mid 1990s for the DEC Alpha, ints were 32 bits
while longs were 64 (unlike "standard" C). This made Alpha C code not
portable to lesser CPUs. On the other hand, when I wrote C on DOS for
the IBM PC in the late 1980s, ints were only 8 bits! It took some time
to figure out why my C-compliant code failed so badly. In spite of all
that, having started programming before C was invented, I can safely
say that C is better than its predecessors for software like ClamAV.

P.S. Good code these days tends to use typedefs defining things like
int32, uint64 etc. A shame the original ClamAV coders didn't do that.
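
For illustration, this is roughly what counters with explicit widths look like using C99's <stdint.h>. The struct and function names below are hypothetical and are not taken from ClamAV.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width of unsigned long on whatever platform this is compiled for: the C
 * standard guarantees at least 32 bits, and 32 or 64 are the common cases. */
size_t unsigned_long_bits(void)
{
    return sizeof(unsigned long) * CHAR_BIT;
}

/* Hypothetical counters with explicit widths; not the actual ClamAV struct. */
struct scan_counters {
    uint32_t files;         /* number of scanned files */
    uint64_t bytes_scanned; /* exact byte count, 64 bits on every platform */
    uint64_t bytes_read;    /* exact byte count, 64 bits on every platform */
};
```

The uint64_t fields have the same range on every platform that provides the type, whereas the result of unsigned_long_bits() depends on the compiler and target.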



On Tue, 3 Nov 2020 01:53:33 +
"Micah Snyder (micasnyd)"  wrote:

> I hadn't really looked at the code. You raise a good point.
> 
> Changing it isn't super simple.  The info.blocks variable is passed through 
> cli_scandesc_callback() and scan_common() where it's placed into the scan 
> context.  When data is scanned, the amount scanned is divided by 
> CL_COUNT_PRECISION (also found in clamav.h), which is what you multiply the 
> number by to get the value in bytes. Provided that all downstream 
> applications use CL_COUNT_PRECISION as clamscan does, we could shrink the 
> count precision from 4k to something lower, but that would also decrease the 
> max amount of data which could be scanned.  
> 
> If the variable were a uint64_t, that'd probably be fine... but it's an 
> unsigned long int... aka maybe 4 bytes or maybe 8 bytes (don't you love C?).  
> On systems where an unsigned long is 4 bytes, then that'd cap the scan limit 
> at 4GB.  Changing the variable to be an uint64_t would be "best", but it 
> would be a non-backwards compatible change to the API which is very much not 
> worth it. 
> 
> Sigh :-/
> 
> > -Original Message-
> > From: clamav-users  On Behalf Of
> > Paul Kosinski via clamav-users
> > Sent: Monday, November 2, 2020 5:23 PM
> > To: clamav-users@lists.clamav.net
> > Cc: Paul Kosinski 
> > Subject: Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned
> > 
> > Can this really be done? I was looking at the code referred to by G.W.
> > Haywood, and I see that it uses "info.blocks" and "info.rblocks".
> > Looking at the definitions in "clamav-0.103.0/clamscan/", I see the
> > following:
> > 
> > struct s_info {
> >     unsigned int sigs;         /* number of signatures */
> >     unsigned int dirs;         /* number of scanned directories */
> >     unsigned int files;        /* number of scanned files */
> >     unsigned int ifiles;       /* number of infected files */
> >     unsigned int errors;       /* number of errors */
> >     unsigned long int blocks;  /* number of *scanned* 16kb blocks */
> >     unsigned long int rblocks; /* number of *read* 16kb blocks */
> > };
> > 
> > This suggests that the counts for "scanned" and "read" are not really byte
> > counts, and EICAR's 68 bytes would always be recorded as 0 (if normal
> > rounding rules are applied).
> > 
> > 
> > 
> > On Mon, 2 Nov 2020 23:59:20 +
> > "Micah Snyder \(micasnyd\) via clamav-users" 
> > wrote:
> >   
> > > I agree.  We already have some logic in freshclam to convert bytes to 
> > > human  
> > readable B / KiB / MiB / GiB format.  It should be pretty much a copypaste
> > effort to improve the data scanned/read output.  
> > >
> > > -Micah
> > >
> > > On 11/2/20, 9:47 AM, "clamav-users on behalf of G.W. Haywood via clamav- 
> > >  
> > users"  > us...@lists.clamav.net> wrote:
> > >
> > > Hi there,
> > >
> > > On Mon, 2 Nov 2020, Paul Kosinski via clamav-users wrote:
> > >  
> > > > ... I still think it is a bad message that should be fixed.  
> > >
> > > +1
> > >
> > > If you want to try a very quick and dirty tweak to get more precise
> > > numbers, change the value of
> > >
> > > 1) CL_COUNT_PRECISION in .../libclamav/clamav.h from 4096 to 1
> > >
> > > 2) replace '1024' with '1' in four places in clamscan/clamscan.c
> > >
> > > 3) change 'MB' to 'Bytes' in two places in clamscan/clamscan.c and
> > >
> > > 4) rebuild.
> > >
> > > 
> > > 8<--
> > >  

Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-02 Thread Micah Snyder (micasnyd) via clamav-users
I hadn't really looked at the code. You raise a good point.

Changing it isn't super simple.  The info.blocks variable is passed through 
cli_scandesc_callback() and scan_common() where it's placed into the scan 
context.  When data is scanned, the amount scanned is divided by 
CL_COUNT_PRECISION (also found in clamav.h), which is what you multiply the 
number by to get the value in bytes. Provided that all downstream applications 
use CL_COUNT_PRECISION as clamscan does, we could shrink the count precision 
from 4k to something lower, but that would also decrease the max amount of data 
which could be scanned.  

If the variable were a uint64_t, that'd probably be fine... but it's an 
unsigned long int... aka maybe 4 bytes or maybe 8 bytes (don't you love C?).  
On systems where an unsigned long is 4 bytes, then that'd cap the scan limit at 
4GB.  Changing the variable to be a uint64_t would be "best", but it would be 
a non-backwards compatible change to the API which is very much not worth it. 
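
To put numbers on that trade-off, here is a small sketch (not ClamAV code; the function name is made up) that computes the largest byte total a counter can represent for a given counter width and count precision.

```c
#include <stdint.h>

/* Largest byte total representable when a counter of `counter_bits` bits
 * counts units of `precision` bytes. Saturates at UINT64_MAX when the
 * product would not fit in 64 bits. */
uint64_t max_countable_bytes(unsigned counter_bits, uint64_t precision)
{
    uint64_t max_count = (counter_bits >= 64)
                             ? UINT64_MAX
                             : ((UINT64_C(1) << counter_bits) - 1);

    if (precision != 0 && max_count > UINT64_MAX / precision)
        return UINT64_MAX;
    return max_count * precision;
}
```

With a 32-bit counter, a precision of 4096 tops out just under 16 TiB, while a precision of 1 tops out just under 4 GiB, which is the cap described above.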

Sigh :-/

> -Original Message-
> From: clamav-users  On Behalf Of
> Paul Kosinski via clamav-users
> Sent: Monday, November 2, 2020 5:23 PM
> To: clamav-users@lists.clamav.net
> Cc: Paul Kosinski 
> Subject: Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned
> 
> Can this really be done? I was looking at the code referred to by G.W.
> Haywood, and I see that it uses "info.blocks" and "info.rblocks".
> Looking at the definitions in "clamav-0.103.0/clamscan/", I see the
> following:
> 
> struct s_info {
>     unsigned int sigs;         /* number of signatures */
>     unsigned int dirs;         /* number of scanned directories */
>     unsigned int files;        /* number of scanned files */
>     unsigned int ifiles;       /* number of infected files */
>     unsigned int errors;       /* number of errors */
>     unsigned long int blocks;  /* number of *scanned* 16kb blocks */
>     unsigned long int rblocks; /* number of *read* 16kb blocks */
> };
> 
> This suggests that the counts for "scanned" and "read" are not really byte
> counts, and EICAR's 68 bytes would always be recorded as 0 (if normal
> rounding rules are applied).
> 
> 
> 
> On Mon, 2 Nov 2020 23:59:20 +
> "Micah Snyder \(micasnyd\) via clamav-users" 
> wrote:
> 
> > I agree.  We already have some logic in freshclam to convert bytes to human
> readable B / KiB / MiB / GiB format.  It should be pretty much a copypaste
> effort to improve the data scanned/read output.
> >
> > -Micah
> >
> > On 11/2/20, 9:47 AM, "clamav-users on behalf of G.W. Haywood via clamav-
> users"  us...@lists.clamav.net> wrote:
> >
> > Hi there,
> >
> > On Mon, 2 Nov 2020, Paul Kosinski via clamav-users wrote:
> >
> > > ... I still think it is a bad message that should be fixed.
> >
> > +1
> >
> > If you want to try a very quick and dirty tweak to get more precise
> > numbers, change the value of
> >
> > 1) CL_COUNT_PRECISION in .../libclamav/clamav.h from 4096 to 1
> >
> > 2) replace '1024' with '1' in four places in clamscan/clamscan.c
> >
> > 3) change 'MB' to 'Bytes' in two places in clamscan/clamscan.c and
> >
> > 4) rebuild.
> >
> > 8<--
> > ~/clamav-0.103.0-rc2: $ grep -C3 -r CL_COUNT_PRECISION clamscan libclamav | ...
> > ...
> > ...
> > clamscan/clamscan.c:mb = info.blocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
> > clamscan/clamscan.c:logg("Data scanned: %2.2lf MB\n", mb);
> > clamscan/clamscan.c:rmb = info.rblocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
> > clamscan/clamscan.c:logg("Data read: %2.2lf MB (ratio %.2f:1)\n", rmb, info.rblocks ? (double)info.blocks / (double)info.rblocks : 0);
> > ...
> > ...
> > libclamav/clamav.h:#define CL_COUNT_PRECISION 4096
> > ...
> > ...
> > 8<--
> >
> > This is untested, YMMV.  Obviously, if you're skilled in the art, this
> > can be done better.  Note that 'MB' should in any case be 'MiB' as the
> > values printed are the counts divided by 2^20 and not by 10^6.
> >
> > --
> >
> > 73,
> > Ged.
> 



Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-02 Thread Paul Kosinski via clamav-users
Can this really be done? I was looking at the code referred to by G.W.
Haywood, and I see that it uses "info.blocks" and "info.rblocks".
Looking at the definitions in "clamav-0.103.0/clamscan/", I see the
following:

struct s_info {
    unsigned int sigs;         /* number of signatures */
    unsigned int dirs;         /* number of scanned directories */
    unsigned int files;        /* number of scanned files */
    unsigned int ifiles;       /* number of infected files */
    unsigned int errors;       /* number of errors */
    unsigned long int blocks;  /* number of *scanned* 16kb blocks */
    unsigned long int rblocks; /* number of *read* 16kb blocks */
};

This suggests that the counts for "scanned" and "read" are not really
byte counts, and EICAR's 68 bytes would always be recorded as 0 (if
normal rounding rules are applied).
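
To make the rounding concrete, a minimal sketch follows. It assumes a 16384-byte block as the struct comments say; the function names are hypothetical and this is not how libclamav is known to do the division.

```c
#define BLOCK_SIZE 16384UL /* assumed from the "16kb blocks" comments */

/* Truncating division: any remainder is thrown away. */
unsigned long blocks_truncated(unsigned long bytes)
{
    return bytes / BLOCK_SIZE;
}

/* Round to the nearest block: remainders of half a block or more count as 1.
 * Written with a remainder test so that large byte counts cannot overflow. */
unsigned long blocks_nearest(unsigned long bytes)
{
    return bytes / BLOCK_SIZE + (bytes % BLOCK_SIZE >= BLOCK_SIZE / 2);
}

/* Round up: any non-zero remainder counts as one more block. */
unsigned long blocks_rounded_up(unsigned long bytes)
{
    return bytes / BLOCK_SIZE + (bytes % BLOCK_SIZE != 0);
}
```

For EICAR's 68 bytes, truncation and round-to-nearest both give 0 blocks; only rounding up gives 1.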



On Mon, 2 Nov 2020 23:59:20 +
"Micah Snyder \(micasnyd\) via clamav-users"  
wrote:

> I agree.  We already have some logic in freshclam to convert bytes to human 
> readable B / KiB / MiB / GiB format.  It should be pretty much a copypaste 
> effort to improve the data scanned/read output. 
> 
> -Micah
> 
> On 11/2/20, 9:47 AM, "clamav-users on behalf of G.W. Haywood via 
> clamav-users"  clamav-users@lists.clamav.net> wrote:
> 
> Hi there,
> 
> On Mon, 2 Nov 2020, Paul Kosinski via clamav-users wrote:
> 
> > ... I still think it is a bad message that should be fixed.  
> 
> +1
> 
> If you want to try a very quick and dirty tweak to get more precise
> numbers, change the value of
> 
> 1) CL_COUNT_PRECISION in .../libclamav/clamav.h from 4096 to 1
> 
> 2) replace '1024' with '1' in four places in clamscan/clamscan.c
> 
> 3) change 'MB' to 'Bytes' in two places in clamscan/clamscan.c and
> 
> 4) rebuild.
> 
> 8<--
> ~/clamav-0.103.0-rc2: $ grep -C3 -r CL_COUNT_PRECISION clamscan libclamav | ...
> ...
> ...
> clamscan/clamscan.c:mb = info.blocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
> clamscan/clamscan.c:logg("Data scanned: %2.2lf MB\n", mb);
> clamscan/clamscan.c:rmb = info.rblocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
> clamscan/clamscan.c:logg("Data read: %2.2lf MB (ratio %.2f:1)\n", rmb, info.rblocks ? (double)info.blocks / (double)info.rblocks : 0);
> ...
> ...
> libclamav/clamav.h:#define CL_COUNT_PRECISION 4096
> ...
> ...
> 8<--
> 
> This is untested, YMMV.  Obviously, if you're skilled in the art, this
> can be done better.  Note that 'MB' should in any case be 'MiB' as the
> values printed are the counts divided by 2^20 and not by 10^6.
> 
> -- 
> 
> 73,
> Ged.



Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-02 Thread Micah Snyder (micasnyd) via clamav-users
I agree.  We already have some logic in freshclam to convert bytes to human 
readable B / KiB / MiB / GiB format.  It should be pretty much a copypaste 
effort to improve the data scanned/read output. 
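
As a sketch of the kind of conversion meant here (this is not the freshclam routine; the function name and the two-decimal format are assumptions for illustration):

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: scale a byte count to the largest binary unit that
 * keeps the value at or above 1. Plain byte counts are printed exactly. */
int format_bytes(uint64_t bytes, char *buf, size_t len)
{
    static const char *units[] = { "B", "KiB", "MiB", "GiB" };
    double value = (double)bytes;
    size_t unit = 0;

    while (value >= 1024.0 && unit < 3) {
        value /= 1024.0;
        unit++;
    }
    if (unit == 0)
        return snprintf(buf, len, "%llu B", (unsigned long long)bytes);
    return snprintf(buf, len, "%.2f %s", value, units[unit]);
}
```

A 68-byte file would then be reported as "68 B" rather than "0.00 MB", provided the counter actually holds bytes and not blocks.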

-Micah

On 11/2/20, 9:47 AM, "clamav-users on behalf of G.W. Haywood via clamav-users" wrote:

Hi there,

On Mon, 2 Nov 2020, Paul Kosinski via clamav-users wrote:

> ... I still think it is a bad message that should be fixed.

+1

If you want to try a very quick and dirty tweak to get more precise
numbers, change the value of

1) CL_COUNT_PRECISION in .../libclamav/clamav.h from 4096 to 1

2) replace '1024' with '1' in four places in clamscan/clamscan.c

3) change 'MB' to 'Bytes' in two places in clamscan/clamscan.c and

4) rebuild.

8<--
~/clamav-0.103.0-rc2: $ grep -C3 -r CL_COUNT_PRECISION clamscan libclamav | ...
...
...
clamscan/clamscan.c:mb = info.blocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
clamscan/clamscan.c:logg("Data scanned: %2.2lf MB\n", mb);
clamscan/clamscan.c:rmb = info.rblocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
clamscan/clamscan.c:logg("Data read: %2.2lf MB (ratio %.2f:1)\n", rmb, info.rblocks ? (double)info.blocks / (double)info.rblocks : 0);
...
...
libclamav/clamav.h:#define CL_COUNT_PRECISION 4096
...
...
8<--

This is untested, YMMV.  Obviously, if you're skilled in the art, this
can be done better.  Note that 'MB' should in any case be 'MiB' as the
values printed are the counts divided by 2^20 and not by 10^6.

-- 

73,
Ged.





Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-02 Thread G.W. Haywood via clamav-users

Hi there,

On Mon, 2 Nov 2020, Paul Kosinski via clamav-users wrote:


... I still think it is a bad message that should be fixed.


+1

If you want to try a very quick and dirty tweak to get more precise
numbers, change the value of

1) CL_COUNT_PRECISION in .../libclamav/clamav.h from 4096 to 1

2) replace '1024' with '1' in four places in clamscan/clamscan.c

3) change 'MB' to 'Bytes' in two places in clamscan/clamscan.c and

4) rebuild.

8<--
~/clamav-0.103.0-rc2: $ grep -C3 -r CL_COUNT_PRECISION clamscan libclamav | ...
...
...
clamscan/clamscan.c:mb = info.blocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
clamscan/clamscan.c:logg("Data scanned: %2.2lf MB\n", mb);
clamscan/clamscan.c:rmb = info.rblocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
clamscan/clamscan.c:logg("Data read: %2.2lf MB (ratio %.2f:1)\n", rmb, info.rblocks ? (double)info.blocks / (double)info.rblocks : 0);
...
...
libclamav/clamav.h:#define CL_COUNT_PRECISION 4096
...
...
8<--

This is untested, YMMV.  Obviously, if you're skilled in the art, this
can be done better.  Note that 'MB' should in any case be 'MiB' as the
values printed are the counts divided by 2^20 and not by 10^6.
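
For anyone who wants to see the arithmetic in isolation, the sketch below repeats the calculation from the grep output above as a standalone function and labels the result MiB. The function names are made up for illustration; only CL_COUNT_PRECISION and its value of 4096 come from the source shown.

```c
#include <stdio.h>

#define CL_COUNT_PRECISION 4096 /* value shown in the grep output above */

/* Same arithmetic as the "Data scanned" line: blocks * (4096 / 1024) is a
 * count of KiB, and dividing by 1024.0 turns that into MiB. */
double blocks_to_mib(unsigned long blocks)
{
    return blocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
}

/* Format the summary line with two decimal places and the MiB label. */
int format_data_scanned(unsigned long blocks, char *buf, size_t len)
{
    return snprintf(buf, len, "Data scanned: %.2f MiB", blocks_to_mib(blocks));
}
```

One counted block is 0.00390625 MiB, which still prints as 0.00; it takes three blocks before the line shows 0.01 MiB.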

--

73,
Ged.



Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-02 Thread Paul Kosinski via clamav-users
When I first saw this message, I quickly concluded it was a roundoff
behavior. But I still think it is a bad message that should be fixed.

First, most file managers that only display file sizes in "human
readable" form, still display a non-zero size for small files. Second,
it is not logically impossible (in principle) that a file be enumerated
while not having any data read or scanned.

Third, and perhaps most important, it can make users new to ClamAV
doubt its design, implementation or reliability.



On Sun, 1 Nov 2020 19:53:18 -0800
Al Varnell via clamav-users  wrote:

> The eicar test file is 68 bytes long, which is about 0.00007 MB; rounded to 
> two decimal places, that is 0.00 MB both scanned and read.
> 
> There are various limits, depending on file and archive types as to how much 
> is read and/or scanned. In most cases they will be exactly the same.
> 
> -Al-
> 
> > On Nov 1, 2020, at 19:40, Ankur Sharma via clamav-users 
> >  wrote:
> > 
> > Hi All,
> > 
> > I tried to scan an eicar test file and got the following scan output:
> > 
> > {'Scanning /tmp/bucket-file-upload/eicar_com.zip!ZIP': 'eicar.com 
> > ', '/tmp/bucket-file-upload/eicar_com.zip': 
> > 'Win.Test.EICAR_HDB-1 FOUND', 
> > '/tmp/bucket-file-upload/eicar_com.zip!(1)ZIP': 'eicar.com 
> > : Win.Test.EICAR_HDB-1 FOUND', 'Known viruses': 
> > '8931107', 'Engine version': '0.102.4', 'Scanned directories': '0', 
> > 'Scanned files': '1', 'Infected files': '1', 'Data scanned': '0.00 MB', 
> > 'Data read': '0.00 MB (ratio 0.00:1)', 'Time': '22.963 sec (0 m 22 s)'}
> > 
> > Though it correctly mentions that the 'Infected files' is '1'. It mentions 
> > that data scanned and data read is 0.00 MB. Can someone please help me and 
> > confirm what is Data read and Data scanned ? How are these different?
> > 
> > Thanks a lot for your time.
> > 
> > -- 
> > regards
> > Ankur  



Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-01 Thread Al Varnell via clamav-users
The eicar test file is 68 bytes long, which is about 0.00007 MB; rounded to 
two decimal places, that is 0.00 MB both scanned and read.
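
Worked out explicitly, 68 bytes is 0.000068 MB (or roughly 0.000065 MiB). The few lines of C below are only a sketch of that arithmetic, not ClamAV code, and the function names are invented for illustration.

```c
#include <stdio.h>

/* Decimal megabytes: 1 MB = 10^6 bytes. */
double bytes_to_mb(unsigned long long bytes)
{
    return (double)bytes / 1000000.0;
}

/* Binary mebibytes: 1 MiB = 2^20 bytes. */
double bytes_to_mib(unsigned long long bytes)
{
    return (double)bytes / 1048576.0;
}

/* Format a value with two decimal places, as the summary line does. */
void format_two_places(double value, char *buf, size_t len)
{
    snprintf(buf, len, "%.2f", value);
}
```

Either way the value is far below 0.005, so two decimal places can only show 0.00.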

There are various limits, depending on file and archive types as to how much is 
read and/or scanned. In most cases they will be exactly the same.

-Al-

> On Nov 1, 2020, at 19:40, Ankur Sharma via clamav-users 
>  wrote:
> 
> Hi All,
> 
> I tried to scan an eicar test file and got the following scan output:
> 
> {'Scanning /tmp/bucket-file-upload/eicar_com.zip!ZIP': 'eicar.com 
> ', '/tmp/bucket-file-upload/eicar_com.zip': 
> 'Win.Test.EICAR_HDB-1 FOUND', '/tmp/bucket-file-upload/eicar_com.zip!(1)ZIP': 
> 'eicar.com : Win.Test.EICAR_HDB-1 FOUND', 'Known viruses': 
> '8931107', 'Engine version': '0.102.4', 'Scanned directories': '0', 'Scanned 
> files': '1', 'Infected files': '1', 'Data scanned': '0.00 MB', 'Data read': 
> '0.00 MB (ratio 0.00:1)', 'Time': '22.963 sec (0 m 22 s)'}
> 
> Though it correctly mentions that the 'Infected files' is '1'. It mentions 
> that data scanned and data read is 0.00 MB. Can someone please help me and 
> confirm what is Data read and Data scanned ? How are these different?
> 
> Thanks a lot for your time.
> 
> -- 
> regards
> Ankur




[clamav-users] ClamAV Scan - Data Read vs Data Scanned

2020-11-01 Thread Ankur Sharma via clamav-users
Hi All,

I tried to scan an eicar test file and got the following scan output:

{'Scanning /tmp/bucket-file-upload/eicar_com.zip!ZIP': 'eicar.com',
'/tmp/bucket-file-upload/eicar_com.zip': 'Win.Test.EICAR_HDB-1 FOUND',
'/tmp/bucket-file-upload/eicar_com.zip!(1)ZIP': 'eicar.com:
Win.Test.EICAR_HDB-1 FOUND', 'Known viruses': '8931107', 'Engine version':
'0.102.4', 'Scanned directories': '0', 'Scanned files': '1', *'Infected
files': '1',* *'Data scanned': '0.00 MB', 'Data read': '0.00 MB (ratio
0.00:1)'*, 'Time': '22.963 sec (0 m 22 s)'}

Although it correctly reports 'Infected files' as '1', it reports both data
scanned and data read as 0.00 MB. Can someone please explain what 'Data read'
and 'Data scanned' mean, and how they differ?

Thanks a lot for your time.

-- 
regards
Ankur
+61481141085
