On 2008-12-30 13:44, Babu.N wrote:
> Hi Edwin,
>
> Thanks for the response.
>
> Please see inline..
>
>
> At 05:26 PM 12/29/2008, Török Edwin wrote:
>   
>> On 2008-12-29 12:53, Babu.N wrote:
>>     
>>> Hi,
>>>
>>> I am developing a SHIM layer for ClamAV to support Freescale pattern
>>> matching hardware. Could you please clarify a few queries:
>>>
>>> 1. Freescale has a pattern matching engine with 64k pattern capacity.
>>>
>>>       
>> How long can the patterns be? Does it support wildcards?
>> Does it support regular expressions?
>>     
>
> Yes.
>
>   

There has to be a limit on the size of a regular expression, or else I
could upload a 2 GB regular expression into it ;)


>>> But ClamAV has approx. 169000 signatures. This means the hardware engine
>>> will not be able to accommodate all the signatures.
>>>       
>> What if you combine N patterns into a single regular expression
>> (hardware limits allowing)?
>> If there is a match, you then use software to tell which of the N
>> patterns matched.
>>     
>
> After hardware reports a match in a combined regex, how can software
> distinguish which sub-regex actually matched?
>   

By matching with a specialized trie for the candidate sub-regexes.
For example, let's assume you combine patterns 1, 74, and 192 into a
single regex for hardware matching.
When the hardware reports a match, in software you only need to try
matching with a trie containing signatures 1, 74, and 192, which should
be very fast.
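
A minimal sketch of such a per-group trie in C, assuming the candidate
signatures are plain byte strings (real signatures with wildcards would
need more than this). The names trie_node, trie_add and trie_scan are made
up for illustration and are not libclamav functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type: one child per byte value. */
struct trie_node {
    struct trie_node *child[256];
    int sig_id;                 /* -1 if no signature ends at this node */
};

struct trie_node *trie_new(void)
{
    struct trie_node *n = calloc(1, sizeof(*n));
    assert(n != NULL);
    n->sig_id = -1;
    return n;
}

/* Insert one literal pattern and remember which signature it belongs to. */
void trie_add(struct trie_node *root, const unsigned char *pat,
              size_t len, int sig_id)
{
    struct trie_node *n = root;
    for (size_t i = 0; i < len; i++) {
        if (!n->child[pat[i]])
            n->child[pat[i]] = trie_new();
        n = n->child[pat[i]];
    }
    n->sig_id = sig_id;
}

/* Try every start offset; return the id of the first signature found,
 * or -1 if none of the candidates occurs in buf. */
int trie_scan(const struct trie_node *root, const unsigned char *buf,
              size_t len)
{
    for (size_t start = 0; start < len; start++) {
        const struct trie_node *n = root;
        for (size_t i = start; i < len; i++) {
            n = n->child[buf[i]];
            if (!n)
                break;
            if (n->sig_id >= 0)
                return n->sig_id;
        }
    }
    return -1;
}

void trie_free(struct trie_node *n)
{
    if (!n)
        return;
    for (int i = 0; i < 256; i++)
        trie_free(n->child[i]);
    free(n);
}
```

The idea would be to build one such trie per hardware regex group ahead of
time, and to call trie_scan only on the buffer for which the hardware
reported a hit in that group.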

Keep in mind that in a real situation most files you scan are clean, and
you should get matches only when a file is infected.
Of course there are also the on-the-fly filetype signatures
(html/pe/sfx), which tend to match quite often.

Even so, you already speed things up a lot if the hardware can determine
that software only needs to match against a trie that has 4-5 patterns.
Of course, those tries should be prebuilt.

Also, patterns that are part of logical signatures need special treatment
(you need to count how many times the sub-signatures matched).
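
A rough sketch of that counting in C. Everything here is hypothetical
(lsig_state, lsig_record, and the example rule are not libclamav names);
it only shows that a plain "matched / not matched" bit from the hardware is
not enough, because the rule is evaluated on per-sub-signature counts.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUBSIGS 8

/* Hypothetical per-file state: number of hits seen for each
 * sub-signature of one logical signature. */
struct lsig_state {
    unsigned count[MAX_SUBSIGS];
};

/* Call once for every reported hit of sub-signature `subsig`. */
void lsig_record(struct lsig_state *s, size_t subsig)
{
    assert(subsig < MAX_SUBSIGS);
    s->count[subsig]++;
}

/* Made-up example rule: sub-signature 0 must match at least twice
 * AND sub-signature 1 must match at least once. */
int lsig_example_rule(const struct lsig_state *s)
{
    return s->count[0] >= 2 && s->count[1] >= 1;
}
```

The state would be reset for each scanned file, and the rule checked only
after the whole file has been matched.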

> I have gone through the function reload_db. It seems to first free the
> existing signatures (cl_free) and then load the new signatures. Which
> code path should I follow to understand that the old signatures are not
> released until the last thread finishes its processing?
>   

cl_engine_free only drops the reference count. The engine is actually
freed only when the refcount reaches zero; otherwise it stays alive.
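
To illustrate the general pattern (this is not the actual libclamav code;
the struct and the engine_new/engine_ref/engine_unref names are made up),
a reference-counted object in C could look roughly like this, assuming a
C11 compiler with <stdatomic.h>:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical engine object; signature data would live next to the
 * counter. */
struct engine {
    atomic_uint refcount;
};

/* The creator holds the first reference. */
struct engine *engine_new(void)
{
    struct engine *e = malloc(sizeof(*e));
    assert(e != NULL);
    atomic_init(&e->refcount, 1);
    return e;
}

/* Each scanning thread takes a reference before it starts using e. */
void engine_ref(struct engine *e)
{
    atomic_fetch_add(&e->refcount, 1);
}

/* Drop one reference. Returns 1 if this was the last one and the
 * engine was really freed, 0 if somebody else still holds it. */
int engine_unref(struct engine *e)
{
    if (atomic_fetch_sub(&e->refcount, 1) == 1) {
        free(e);
        return 1;
    }
    return 0;
}
```

With this scheme a reload can drop its own reference right away and load
the new database; the old signatures go away only when the last thread
still scanning with them drops its reference too.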

Best regards,
--Edwin
_______________________________________________
http://lurker.clamav.net/list/clamav-devel.html
Please submit your patches to our Bugzilla: http://bugs.clamav.net
