On 2008-12-16 01:13, Thomasz Blaszczyk wrote:
> Hello,
> I just reviewed a few multi-pattern string scanning algorithms,
> and there are many multi-pattern variants of Boyer-Moore.
> I am curious whether the one implemented in ClamAV is Boyer-Moore-Horspool,
> the one from the authors of GLIMPSE, Set-wise Boyer-Moore, or
> AC_BM as proposed by Silicon Defence?
> Any hints?

It is not AC_BM.
Hint: it uses a hash.
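
To illustrate what a hash-based multi-pattern Boyer-Moore variant can look like, here is a minimal sketch in the style of Wu and Manber's algorithm: a shift table keyed on 2-byte blocks, plus hash buckets of candidate patterns. This is only an assumption about the general family the hint points at, not ClamAV's actual code; the names build_tables and scan and the block size of 2 are made up for the example.

```python
def build_tables(patterns, block=2):
    """Build a shift table and hash buckets for a set of byte patterns."""
    # Only the first m bytes of each pattern drive the skipping.
    m = min(len(p) for p in patterns)
    if m < block:
        raise ValueError("every pattern must be at least `block` bytes long")
    default_shift = m - block + 1  # skip used for blocks no pattern contains
    shift = {}    # block -> safe skip distance (a Python dict hashes the block)
    buckets = {}  # last block of the m-byte prefix -> candidate patterns
    for p in patterns:
        for i in range(m - block + 1):
            key = p[i:i + block]
            dist = m - block - i  # distance from this block to the prefix end
            shift[key] = min(shift.get(key, default_shift), dist)
        buckets.setdefault(p[m - block:m], []).append(p)
    return m, block, default_shift, shift, buckets


def scan(data, tables):
    """Return (offset, pattern) pairs for every match found in data."""
    m, block, default_shift, shift, buckets = tables
    hits = []
    pos = m - block  # start of the block that ends the current m-byte window
    while pos + block <= len(data):
        key = data[pos:pos + block]
        s = shift.get(key, default_shift)
        if s == 0:
            # The window may end a pattern prefix: verify each candidate.
            start = pos - (m - block)
            for p in buckets.get(key, ()):
                if data.startswith(p, start):
                    hits.append((start, p))
            s = 1
        pos += s
    return hits


tables = build_tables([b"EICAR", b"virus", b"worm!"])
print(scan(b"xxvirusxxEICARxxworm!", tables))
```

The point of the hash is that a single lookup on the last block of the window either yields a skip distance or a short list of candidates to verify, regardless of how many patterns are loaded.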

>> You can't get a BMOnly, because some signatures *require* the AC
>> matcher, such as any signature containing wildcards.
> OK, so all signatures without wildcards (static signatures) are handled by
> the BM multi-pattern matcher,
> and the ones with wildcards are handled by AC.

Read cli_parse_add(). Yes, any of the wildcards listed in signatures.pdf
causes the signature to be loaded into the AC trie.

> Is that the rule? Or are there
> other factors as well?

There is one more factor: signatures for certain file types are always
loaded into the AC trie, even if they are static; see cli_mtargets.
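
The routing rule described above can be sketched roughly as follows. This is not the logic of cli_parse_add() itself. It assumes that a signature body made up only of hex digits is static, and it uses a hypothetical ac_only_targets set as a stand-in for the per-target information in cli_mtargets. The function name and the target numbers are invented for the example.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def choose_matcher(hex_sig, target, ac_only_targets=frozenset()):
    """Return "AC" or "BM" for a signature body.

    Assumptions (not taken from ClamAV's source): a body consisting only of
    hex digits is static, and `ac_only_targets` is a hypothetical set of
    target types whose signatures always go to the AC trie.
    """
    if target in ac_only_targets:
        return "AC"  # forced to AC regardless of content
    if all(ch in HEX_DIGITS for ch in hex_sig):
        return "BM"  # static signature, no wildcards
    return "AC"      # anything else contains some wildcard syntax


print(choose_matcher("deadbeef", target=0))            # static body
print(choose_matcher("dead??beef", target=0))          # contains a wildcard
print(choose_matcher("deadbeef", 3, ac_only_targets={3}))  # forced by target
```

To see what the real loader decided for a given database, the --debug output mentioned in this mail is the authoritative source.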

Have a look at the --debug output, it tells you how many signatures have
been loaded in which trie, and for which filetype.

> How much does BM slow down the scanner for signatures with wildcards? Has
> anyone performed such an analysis?

I don't think wildcards were ever implemented in matcher-bm.c, so I
don't know.
BM is good for longer signatures, and it uses less memory than AC.
However, if you switch ClamAV to use only AC, you'll notice a significant
performance improvement, at the expense of increased memory usage for
the DB.
Since BM is slower, implementing wildcard support doesn't provide any
benefit.

P.S.: If you're running on a multicore machine, you can use multiscan to
use all of your cores: clamdscan -m

Some more things to watch out for when benchmarking:
- performance depends very much on what kind of data you scan
(executables, mails, small files, large files)
- there are scalability issues with the Linux kernel (google for
mmap_sem and ClamAV)
- you should have fast disks, so that you're sure you're benchmarking
ClamAV and not your I/O system
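
One way to act on the points above is a small timing harness that does an untimed warm-up pass first (so later runs are less likely to be dominated by disk reads) and then reports the spread over several runs. This is a minimal sketch. The function name benchmark, its parameters, and the sample path in the comment are hypothetical, and a warm-up pass only helps if the data set fits in the page cache.

```python
import statistics
import time


def benchmark(run_scan, repeats=5, warmups=1):
    """Time run_scan() `repeats` times after `warmups` untimed passes."""
    for _ in range(warmups):
        run_scan()  # untimed: gives the OS a chance to cache the files
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        run_scan()
        timings.append(time.perf_counter() - t0)
    return {
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
        "runs": timings,
    }


# Example use with clamdscan (the path is a placeholder):
#   import subprocess
#   result = benchmark(
#       lambda: subprocess.run(["clamdscan", "-m", "/path/to/samples"]))
#   print(result["median"])
```

Running the harness separately on each kind of data (executables, mails, small files, large files) keeps the numbers comparable, since performance depends heavily on what is scanned.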

Best regards,
Please submit your patches to our Bugzilla: http://bugs.clamav.net
