On 2009-01-26 20:45, Tom Shaw wrote:
> I have run into some problems creating rules. I 
> am trying to create phish rules as
>
> R[Filter]:RealURL:DisplayedURL[:FuncLevelSpec]
> or
> MalwareName:TargetType:Offset:HexSignature[:MinEngineFunctionalityLevel:[Max]]
>
> and I am having two problems.
>
> First problem has to do with UTF/UNICODE 
> characters as well as various codepages which are 
> used in place of ascii in spam and phish. What 
> makes this more difficult is that one email might 
> contain ascii, another UTF, and yet another 
> Latin-2 all representing the same signature. So 
> how does one create a regex for the "R" rules 
> and/or a HEX sequence that can deal with various 
> character sets?
>   

The HTML normalizer takes care of this: Unicode characters outside the
ASCII range (above 127) get converted into
&<character-code>;.
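
To illustrate why this helps with mixed character sets, here is a rough
Python sketch. It is not ClamAV's code: the function name normalize_text
is made up for this example, the charset is assumed to be known, and the
output form simply follows the &<character-code>; description above with
a decimal code point. The point is only that the same text arriving as
UTF-8 or as Latin-2 ends up as one ASCII string to write a pattern against.

```python
def normalize_text(raw: bytes, charset: str) -> str:
    """Decode raw bytes, then replace each non-ASCII character with &<code>;.

    Illustration only: this mimics the behaviour described in the email,
    it is not taken from ClamAV's HTML normalizer.
    """
    text = raw.decode(charset)
    return "".join(ch if ord(ch) < 128 else f"&{ord(ch)};" for ch in text)


# The same word delivered in two different encodings (hypothetical sample).
word = "Bankowość"
utf8_form = normalize_text(word.encode("utf-8"), "utf-8")
latin2_form = normalize_text(word.encode("iso8859_2"), "iso8859_2")

# Both inputs normalize to the same pure-ASCII string.
print(utf8_form == latin2_form, utf8_form)
```

To see what the real normalizer produces for a given message, check the
debug output rather than relying on this sketch.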

The easiest way to see what the phishing code is looking at is to run
clamscan --debug and grep for 'Phish'. It will
show the exact, normalized URLs it is looking at.

However, for .pdb signatures we found type 'H' to be sufficient: you
simply list the domain.
In fact, the official signatures never used the 'R' type.

You'll only need the 'R' type if you want to match the domain that hosts
the phishing site. Is that what you want?

Regular expressions are useful for the whitelist (wdb format), where they
are type 'X' signatures.

> My second source of confusion is with target type. The options are
>
> * 0 = any file
> * 1 = Portable Executable
> * 2 = OLE2 component (e.g. a VBA script)
> * 3 = HTML (normalised)
> * 4 = Mail file
> * 5 = Graphics
> * 6 = ELF
> * 7 = ASCII text file (normalised)
>   

These are types for .ndb signatures, and are not needed/valid for
.pdb/.wdb signatures.
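
To make the distinction concrete, here is a small Python sketch that
splits a line in the .ndb layout you quoted
(MalwareName:TargetType:Offset:HexSignature[:Min[:Max]]) and looks up the
target type in your list. The names parse_ndb_line and NdbSignature and
the sample signature are hypothetical, made up for this example, and the
parser is deliberately naive: it only splits on ':' and does not validate
the offset or the hex pattern.

```python
from typing import NamedTuple, Optional

# Target types for .ndb signatures, as listed in the quoted question.
TARGET_TYPES = {
    0: "any file",
    1: "Portable Executable",
    2: "OLE2 component",
    3: "HTML (normalised)",
    4: "Mail file",
    5: "Graphics",
    6: "ELF",
    7: "ASCII text file (normalised)",
}


class NdbSignature(NamedTuple):
    name: str
    target_type: int
    offset: str
    hex_signature: str
    min_flevel: Optional[int] = None
    max_flevel: Optional[int] = None


def parse_ndb_line(line: str) -> NdbSignature:
    """Split one .ndb-style line into its fields (illustrative helper)."""
    fields = line.strip().split(":")
    if not 4 <= len(fields) <= 6:
        raise ValueError(f"expected 4 to 6 fields, got {len(fields)}")
    name, target, offset, hex_sig = fields[:4]
    target_type = int(target)
    if target_type not in TARGET_TYPES:
        raise ValueError(f"unknown target type: {target_type}")
    # Optional trailing fields: minimum and maximum functionality level.
    levels = [int(f) for f in fields[4:]]
    return NdbSignature(name, target_type, offset, hex_sig, *levels)


# Hypothetical signature: the hex bytes spell "paypal", matched in
# normalised HTML (target type 3) at any offset.
sig = parse_ndb_line("Example.Phish.Test:3:*:70617970616c")
print(sig.name, "->", TARGET_TYPES[sig.target_type])
```

A .pdb or .wdb line has no TargetType field at all, so nothing like this
lookup applies there.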

Best regards,
--Edwin
_______________________________________________
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://www.clamav.net/support/ml
