On 2008-12-03 03:02, Thomasz Blaszczyk wrote:
>  Hi,
>   

Hi,


>  I am new to ClamAV and I am just wondering how files are scanned.
>
>  Does it work like:
>  1. PE section is taken from file to be scanned
>   

It is much more than that. ClamAV can also process a variety of archive
formats, containers, and executable packers.
Also, PE files aren't the only files that can carry malware; scripts can
contain malware too.

Have a look at filetypes_int.h for the file types we support. New file
type definitions can be added via database updates.

>  2. MD5 is calculated
>   

Correct, but ClamAV also uses a pattern matcher (Aho-Corasick and an
extended version of Boyer-Moore), not only MD5.
See signatures.pdf for the kinds of patterns it supports (in particular,
the AC matcher supports wildcards).

So ClamAV actually tries to match those patterns inside the file. It
also has some heuristic and algorithmic detections.
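
To illustrate what matching a wildcard pattern inside a file means, here
is a minimal Python sketch. It is not ClamAV's code and it does not
implement Aho-Corasick; it swaps in Python's regex engine. It assumes a
hex signature where "??" stands for any single byte and "*" for any run
of bytes (see signatures.pdf for the real syntax). The function names
are made up for this example.

```python
import re


def compile_hex_signature(sig: str) -> "re.Pattern[bytes]":
    """Translate a hex signature into a bytes regex.

    Assumed syntax (illustration only): pairs of hex digits are literal
    bytes, '??' matches any one byte, '*' matches any number of bytes.
    """
    parts = []
    i = 0
    while i < len(sig):
        if sig[i] == "*":
            parts.append(b".*?")          # any run of bytes, as short as possible
            i += 1
        elif sig[i:i + 2] == "??":
            parts.append(b".")            # exactly one arbitrary byte
            i += 2
        else:
            literal = bytes.fromhex(sig[i:i + 2])  # raises ValueError on bad hex
            parts.append(re.escape(literal))
            i += 2
    # DOTALL so that '.' also matches the newline byte 0x0a
    return re.compile(b"".join(parts), re.DOTALL)


def matches(sig: str, data: bytes) -> bool:
    """Return True if the signature occurs anywhere in data."""
    return compile_hex_signature(sig).search(data) is not None
```

A real multi-pattern matcher such as AC scans for all signatures in one
pass over the data, which a per-signature regex like this does not do.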

An MD5 is calculated for the entire file, and another is calculated for
each PE section.
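
The two kinds of MD5 can be sketched in Python as follows. This is only
an illustration, not ClamAV's implementation: the (name, offset, size)
section list is a hypothetical input that would normally come from
parsing the PE header, which is left out here.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """MD5 of a byte string, as lowercase hex."""
    return hashlib.md5(data).hexdigest()


def section_md5s(data: bytes, sections):
    """MD5 per section.

    `sections` is a hypothetical list of (name, offset, size) tuples;
    a real scanner would read these from the PE section table.
    """
    return {
        name: md5_hex(data[offset:offset + size])
        for name, offset, size in sections
    }


if __name__ == "__main__":
    blob = b"xxabcyy"
    print(md5_hex(blob))                          # whole-file MD5
    print(section_md5s(blob, [(".text", 2, 3)]))  # MD5 of bytes 2..4 only
```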

>  3. That MD5 is compared to all signatures in ClamAV Database
>   

Yes, using the BM matcher, so the MD5 is not compared against each
signature sequentially.
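
To show why the lookup does not need to walk the whole database, here is
a small Python sketch. Note that it substitutes a hash table for the BM
matcher, so it shows the idea (one lookup instead of a linear scan), not
how ClamAV does it. The "MD5:Size:Name" line layout is an assumption
made for this example, and the signature name is invented.

```python
import hashlib


def load_hash_db(lines):
    """Build an index keyed by (md5 hex, file size).

    Assumes one 'MD5:Size:Name' entry per line (layout assumed for
    this sketch); blank lines and '#' comments are skipped.
    """
    db = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, size, name = line.split(":", 2)
        db[(digest.lower(), int(size))] = name
    return db


def lookup(db, data: bytes):
    """Return the signature name for data, or None if it is not listed."""
    key = (hashlib.md5(data).hexdigest(), len(data))
    return db.get(key)  # single dictionary lookup, no sequential scan


if __name__ == "__main__":
    db = load_hash_db(["900150983cd24fb0d6963f7d28e17f72:3:Example.Sample"])
    print(lookup(db, b"abc"))
    print(lookup(db, b"abd"))
```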


>  4. If there is a match, a virus is found.
>   

Yes.

>  I have simplified this. But please let me know if I am right in the above
>  steps for scanning files.

If you only load a database of MD5 signatures, and you disable archive
scanning, algorithmic scans, heuristics, and the HTML and mbox formats,
then yes ;)
In practice, ClamAV does much more than just match an MD5.

Best regards,
--Edwin
_______________________________________________
http://lurker.clamav.net/list/clamav-devel.html
Please submit your patches to our Bugzilla: http://bugs.clamav.net
