Hey guys, Thanks for comments!
I got something interesting, measurements and graphs: http://omploader.org/vMTFlYw & I add my comment: "Intersections where Boyer-Moore improves performance and where Aho-Corasick gets slower is shifting to the right as amount of patterns is increasing (together with file size)." I am looking forward for feedback, especially from Edwin, was the measurements correct? Do results conform to your thoughts/expectation about those algorithms? GiM it seems that BM is efficient for same cases, see link above. (BTW,Thanks for brief history of algorithms implemented in previous CLAMAV versions) My second comment: "The measurements were performed 3 times, For 3 trials per measurement a small difference can occur. This small deviation is in around ~ 30 ms max and can be neglected. Average value from 3 trials has been taken. Very amazing results can be observed. For small number of signatures (20, 100, 200 signatures in database) Aho-Corasick is much better than Boyer-Moore, To scan 2MB file it takes 2 times longer for Boyer-Moore algorithm than Aho-Corasick. But for more than 10.000 signatures in database for the same size of file Boyer-Moore make up for lost time. Still for larger files ( with size greater than 3 MB) for the same signature database Aho-Corasick is better." Greetings, Tom On Sat, Dec 20, 2008 at 7:36 PM, GiM <g...@skrzynka.pl> wrote: > Thomasz Blaszczyk in message 'Re: [Clamav-devel] clamAV scanning algorithm' > wrote: >> > >> > if you switch ClamAV to use only AC, you'll notice a significant >> > performance improvement, at the expense of increased memory usage for >> > the DB. >> Right, AC trees are quite large and takes lot of memory.. >> So BM is only used to save memory? I guess, it was implemented first >> and some people still feel sentiment to this algorithm..:) >> Since AC works faster and handles wildcards... >> > > If I remember correctly it was just on the contrary, > AC was implemented first (version I recall was 0.67, > I haven't looked at a code of earlier versions except > some truly historical stuff like 0.11 or 0.15) > and later because of memory problems Edwin mentioned > at least a few times BM was added to resolve those > issues. > > cheers, > -- > main(int a[puts("Michal 'GiM' Spadlinski")]){} > -- "Stay Hungry. Stay Foolish." Steve Jobs _______________________________________________ http://lurker.clamav.net/list/clamav-devel.html Please submit your patches to our Bugzilla: http://bugs.clamav.net