On 02/16/2010 09:15 PM, Tom Shaw wrote:
At 4:15 PM +0000 2/16/10, Steve Basford wrote:
Attached document? I did not see an attachment. Can you send a link?
Is this the TargetType you are after...
2.3.4 Extended signature format
The extended signature format allows for specification of additional
information such as a target file type, virus offset or engine version,
making the detection more reliable. The format is:
MalwareName:TargetType:Offset:HexSignature[:MinEngineFunctionalityLevel:[Max]]
where TargetType is one of the following numbers specifying the type
of the target file:
0 = any file
1 = Portable Executable
2 = OLE2 component (e.g. a VBA script)
3 = HTML (normalised)
4 = Mail file
5 = Graphics
6 = ELF
7 = ASCII text file (normalised)
And Offset is an asterisk or a decimal number n possibly combined with a
special modifier:
Source: http://www.clamav.com/doc/latest/signatures.pdf
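
To make the field layout quoted above concrete, here is a minimal Python sketch that splits one extended-format line into its parts. The function name, the class name and the sample signature are made up for illustration; the sketch assumes that no field contains a colon, and it keeps Offset as a raw string because the modifier list is not reproduced in this message.

```python
from typing import NamedTuple, Optional


class ExtendedSignature(NamedTuple):
    """Fields of one extended-format signature line (illustrative names)."""
    name: str
    target_type: int
    offset: str                 # kept as text: "*" or a number with a modifier
    hex_signature: str
    min_flevel: Optional[int]   # optional MinEngineFunctionalityLevel
    max_flevel: Optional[int]   # optional Max


def parse_extended_signature(line: str) -> ExtendedSignature:
    """Split MalwareName:TargetType:Offset:HexSignature[:Min:[Max]]."""
    fields = line.strip().split(":")
    if not 4 <= len(fields) <= 6:
        raise ValueError(f"expected 4 to 6 fields, got {len(fields)}")
    name, target, offset, hex_signature = fields[:4]
    # The two functionality-level fields are optional and may be empty.
    min_flevel = int(fields[4]) if len(fields) > 4 and fields[4] else None
    max_flevel = int(fields[5]) if len(fields) > 5 and fields[5] else None
    return ExtendedSignature(name, int(target), offset, hex_signature,
                             min_flevel, max_flevel)


# Hypothetical signature, not taken from any real database.
sig = parse_extended_signature("Example.Test:7:*:68656c6c6f")
print(sig.target_type, sig.offset)
```

A TargetType of 7 here would restrict the hypothetical signature to files classified as normalised ASCII text.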
Steve et al.,
Yes, I know all this. As I told Alain, I have read all the available
docs, but neither they nor the wiki explain how a "7" is determined
(e.g. by extension, and if so which ones, or by contents, and if so
how). Are PHP and Perl files considered ASCII, Portable Executable,
HTML, or something else? Is an RTF considered OLE2 or ASCII? And what
does a Zeus bin file get categorized as? I have searched the docs for
answers to these and many similar questions, with no joy.
Hi Tom,
Didn't my reply answer your question?
[which I've forwarded to -users, but I forgot that it strips
attachments, here it is again]
The file type is determined by matching the signatures in daily.ftm (or
the built-in ones in filetypes_int.h if that file is missing) against a
portion at the beginning of the file.
sigtool --unpack-current daily
cat daily.ftm
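
The idea of matching signatures against the start of a file can be sketched as follows. This is only an illustration in Python: the table below is hypothetical and is not the content or the format of daily.ftm. It uses two widely known magic numbers ("MZ" for DOS/PE executables, "\x7fELF" for ELF) and borrows the CL_TYPE names that appear later in this message.

```python
from typing import Optional

# Hypothetical magic table: (offset, magic bytes, type name).
# Not taken from daily.ftm; only the two magic numbers are well known.
MAGIC_TABLE = [
    (0, b"MZ", "CL_TYPE_MSEXE"),
    (0, b"\x7fELF", "CL_TYPE_ELF"),
]


def detect_by_magic(head: bytes) -> Optional[str]:
    """Return the first type whose magic bytes match the file's leading bytes."""
    for offset, magic, type_name in MAGIC_TABLE:
        if head[offset:offset + len(magic)] == magic:
            return type_name
    return None  # nothing matched; a text/binary heuristic would run next


print(detect_by_magic(b"MZ\x90\x00"))
print(detect_by_magic(b"#!/usr/bin/perl\n"))
```

Note that nothing in this sketch looks at the file name or extension, only at the leading bytes.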
As for binary versus ascii, utf8, utf16be, utf16le see textdet.c: it
looks at the beginning of the file and determines which one it could be,
based on the ratio of good to bad ascii, utf8, etc. characters it has
seen.
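
A ratio-based text check of the kind described above might look like the Python sketch below. It is not the algorithm in textdet.c: the set of "good" bytes, the 0.95 threshold and the function name are all assumptions chosen for illustration, and only the ASCII case is covered.

```python
def looks_like_ascii_text(head: bytes, threshold: float = 0.95) -> bool:
    """Guess whether the leading bytes of a file are ASCII text.

    "Good" bytes are printable ASCII plus tab, LF and CR. The threshold
    is an arbitrary illustrative value, not one taken from ClamAV.
    """
    if not head:
        return False
    good = sum(1 for b in head
               if 0x20 <= b <= 0x7E or b in (0x09, 0x0A, 0x0D))
    return good / len(head) >= threshold


print(looks_like_ascii_text(b"<?php echo 'hi'; ?>\n"))
print(looks_like_ascii_text(bytes(range(256))))
```

Under a heuristic like this, a PHP or Perl script with no matching magic signature would be classed by its contents rather than by its extension.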
Also there are some signatures that are detected on the fly (not only at
the beginning of the file), during a type0 scan:
/* bigger numbers have higher priority (in o-t-f detection) */
CL_TYPE_HTML, /* on the fly */
CL_TYPE_MAIL, /* magic + on the fly */
CL_TYPE_SFX, /* foo SFX marker */
CL_TYPE_ZIPSFX, /* on the fly */
CL_TYPE_RARSFX, /* on the fly */
CL_TYPE_CABSFX,
CL_TYPE_ARJSFX,
CL_TYPE_NULSFT, /* on the fly */
CL_TYPE_AUTOIT,
CL_TYPE_ISHIELD_MSI,
These filetypes are used both to determine what signature to match, and
what unpacker to run.
And the mapping from CL_TYPE to signature targettypes is in matcher.h:
{ 0, "GENERIC", 0, 0, 1 },
{ CL_TYPE_MSEXE, "PE", 1, 0, 1 },
{ CL_TYPE_MSOLE2, "OLE2", 2, 1, 0 },
{ CL_TYPE_HTML, "HTML", 3, 1, 0 },
{ CL_TYPE_MAIL, "MAIL", 4, 1, 1 },
{ CL_TYPE_GRAPHICS, "GRAPHICS", 5, 1, 0 },
{ CL_TYPE_ELF, "ELF", 6, 1, 0 },
{ CL_TYPE_TEXT_ASCII, "ASCII", 7, 1, 1 },
/* note that this actually includes utf8, utf16be, and utf16le too! */
{ CL_TYPE_ERROR, "NOT USED", 8, 1, 0 },
{ CL_TYPE_MACHO, "MACH-O", 9, 1, 0 }
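
The table above can be read as a lookup from detected file type to signature TargetType. The Python sketch below mirrors it, reading the third column as the TargetType number (which agrees with the 0 to 7 list in the documentation quote). The helper name is made up, and the matching rule (TargetType 0 applies to every file, any other number must equal the file's mapped type) is an assumption based on "0 = any file", not a statement about the actual matcher code.

```python
# CL_TYPE name -> TargetType number, copied from the table above.
TARGET_FOR_CL_TYPE = {
    "CL_TYPE_MSEXE": 1,
    "CL_TYPE_MSOLE2": 2,
    "CL_TYPE_HTML": 3,
    "CL_TYPE_MAIL": 4,
    "CL_TYPE_GRAPHICS": 5,
    "CL_TYPE_ELF": 6,
    "CL_TYPE_TEXT_ASCII": 7,   # also covers utf8, utf16be and utf16le
    "CL_TYPE_MACHO": 9,
}


def signature_applies(sig_target: int, file_cl_type: str) -> bool:
    """Assumed rule: target 0 matches any file, otherwise types must agree."""
    if sig_target == 0:
        return True
    return TARGET_FOR_CL_TYPE.get(file_cl_type) == sig_target


print(signature_applies(7, "CL_TYPE_TEXT_ASCII"))
print(signature_applies(7, "CL_TYPE_MSEXE"))
print(signature_applies(0, "CL_TYPE_ELF"))
```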