On Friday 19 January 2007 06:46, Stuart Henderson wrote:
> On 2007/01/19 04:56, J.C. Roberts wrote:
> > Anyhow, I'm working on an updated archivers/unarj port for use with
> > clamav so you can scan inside ARJ archives. Though the current port
> > shows unarj has an "x" switch to extract files from ARJ archives
> > with path names, said switch doesn't work, and never has worked
> > since the code to create directories is just plain missing.
> >
> > So my question for the clamav users is how the heck does clamav
> > work with compressed archives?
>
> clamscan calls 'arj x -y' (n.b. arj, not unarj). For scanning emails,
> I think only MailScanner users are likely to use clamscan (they don't
> mind the startup overhead so much since they batch the mail up and
> then scan it).
>

Thanks Stuart. Since we do not have the GPL'd arj (sourceforge) in the 
ports tree, and the proprietary version (arjsoftware.com) does not have 
UNIX support, people are most likely not using the clamscan method.

The one thing I can say with certainty is that the GPL'd version of arj
on sourceforge is some of the worst and most dangerous C code I've ever
seen. Much as I hate to utter the words "rewrite" or "fork", either may
be justified here. Sure, it will run with some minor patches, but in
trying to actually correct the damn thing I've written countless
patches and I'm still a *long* way from completing the port.

> Other mail virus-scanners I've seen use clamd, so you need to look at
> the mail-scanner and see what _it_ does;
>
> smtp-vilter and clamsmtp pass files straight to clamd, so .arj/rar
> are all passed (unless a clamav signature matches the entire
> archive), or smtp-vilter users might block them by filename, but you
> can't scan individual archive members.
>
> amavisd-new unpacks the files itself before feeding to clamd, this
> uses either 'unarj e' or arj with some complex set of parameters, but
> it makes my brain hurt to read even just their sample config let
> alone the code..ugh..how can an email scanner be more hassle to
> configure than, oh say, totally setting up Opus-CBCS...

Using 'unarj e' gives a false sense of security, since you cannot
actually scan all the files in an archive. The "e" command extracts
without path names, so when files with the same name are stored in
different directories within the archive (as they were in the original
source tree), the duplicates quietly fail to be extracted and are never
scanned.
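
To illustrate the collision, here is a minimal Python sketch (not the
unarj C code). It only groups member paths by the name they would have
once the directory part is dropped; the member names are made up for
the example, and it makes no claim about which duplicate unarj keeps.

```python
import posixpath
from collections import defaultdict


def flattened_collisions(member_paths):
    """Return {basename: [paths]} for names shared by more than one member."""
    by_name = defaultdict(list)
    for path in member_paths:
        # Dropping the directory part is what flat extraction amounts to.
        by_name[posixpath.basename(path)].append(path)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


# Hypothetical archive listing, for illustration only.
members = ["docs/readme.txt", "src/readme.txt", "src/main.c"]
collisions = flattened_collisions(members)
```

Any name that shows up in `collisions` maps to several members but
only one output file, so at least one member never reaches the
scanner.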

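In the unarj port, I can add working support for the "x" switch, so
that files are extracted with their path names and the directories are
actually created. Is this a good way to deal with it?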
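
Roughly, the missing piece is creating each member's parent directory
before writing the file. A minimal Python sketch of that idea follows;
it is not the actual patch, the function name and member data are
hypothetical, and it does no path validation (a real port would have
to reject absolute paths and ".." components).

```python
import os
import tempfile


def extract_with_paths(members, dest):
    """Write each (path, data) member under dest, creating directories first."""
    written = []
    for path, data in members:
        target = os.path.join(dest, *path.split("/"))
        # This is the step the current "x" code path lacks.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as fh:
            fh.write(data)
        written.append(target)
    return written


# Hypothetical members; two share a basename but live in different dirs.
with tempfile.TemporaryDirectory() as dest:
    members = [("docs/readme.txt", b"a"), ("src/readme.txt", b"b")]
    written = extract_with_paths(members, dest)
    extracted_count = len([p for p in written if os.path.exists(p)])
```

With the directories in place, both files with the same name end up on
disk, so a scanner run over the destination sees every member.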

jcr
