Message: 2
Date: Fri, 16 Nov 2007 4:08:01 EST
From: Larry Stone <[EMAIL PROTECTED]>
Subject: Re: [Dspace-tech] Viruses and DSpace
To: "Blanco, Jose" <[EMAIL PROTECTED]>
Cc: [email protected]
Message-ID: <[EMAIL PROTECTED]>

> Has any thought been given to how Dspace might handle the remote (
> hopefully ) possibility of a file containing a virus being deposited
> into a repository?  It seems like jhove might be the kind of tool that
> could check for this.  I believe there is some work going on to
> incorporate jhove into Dspace, how is that coming along?  It's not part
> of 1.5, but what about for the following release?

The BitstreamFormat renovation (see
http://wiki.dspace.org/index.php/BitstreamFormat_Renovation ) doesn't
address this directly, but will make it much easier to integrate tools
because file formats will be identified more effectively and precisely.

Once the format is known you can add a mechanism like the mediafilters,
perhaps integrated with workflow, to run specific checks depending on
the format type.
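
To make the idea of format-specific checks concrete, here is a minimal
Python sketch. It is not the DSpace mediafilter API (which is Java); the
registry, the function names, and the two signature checks are all
hypothetical, and it assumes the format has already been identified as a
MIME type.

```python
from typing import Callable, Dict, List

# A check takes the raw bytes of a bitstream and returns a list of problems.
Check = Callable[[bytes], List[str]]


def check_pdf_header(data: bytes) -> List[str]:
    # PDF files begin with the marker "%PDF-".
    return [] if data.startswith(b"%PDF-") else ["missing %PDF- header"]


def check_ole2_header(data: bytes) -> List[str]:
    # Legacy MS Office files (.doc, .xls) are OLE2 compound files.
    ole2_magic = bytes.fromhex("d0cf11e0a1b11ae1")
    return [] if data.startswith(ole2_magic) else ["missing OLE2 signature"]


# Hypothetical registry: identified format -> checks to run for it.
CHECKS_BY_FORMAT: Dict[str, List[Check]] = {
    "application/pdf": [check_pdf_header],
    "application/msword": [check_ole2_header],
}


def run_checks(mime_type: str, data: bytes) -> List[str]:
    """Run every check registered for the format; unknown formats get none."""
    problems: List[str] = []
    for check in CHECKS_BY_FORMAT.get(mime_type, []):
        problems.extend(check(data))
    return problems
```

A virus scan or a JHOVE validation could be registered the same way, so
that a workflow step only has to call run_checks() with the identified
format.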

JHOVE version 1 is just a format validator and technical-metadata
extractor; it isn't subtle enough to look for viruses.

There _are_ tools in the email filtering domain which detect malicious
MS Office files; I've heard of them but don't remember specifics.  You
could start by looking at SpamAssassin and ClamAV
(see http://www.clamav.net/ ).  However, be aware that any virus-checking
software needs constant updating since you're essentially in an arms race.

    -- Larry (a recovering postmaster)



I realize this may not be particularly relevant to your question, but we
handle virus checking as part of the ingest process.  ClamAV has Python
bindings that make it very simple to integrate into our Python-based
batch ingest processes.  We also run ClamAV against the Ubuntu-based
DSpace server and Solaris-based storage array containing the asset
stores.  Providing feedback to data contributors about malformed,
corrupt, incomplete, and infected data sets is a very important part of
our pre-ingest workflow.
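
A pre-ingest scan along these lines might look like the sketch below.
Note that it shells out to the clamscan command-line tool rather than
using the Python bindings mentioned above, so it is an illustration and
not our actual ingest code. The function name and the injectable `run`
parameter are hypothetical; the only things it relies on are clamscan's
documented exit codes (0 = no virus found, 1 = virus found, 2 = error).

```python
import subprocess
from typing import Callable, Tuple

# clamscan exit codes: 0 = no virus found, 1 = virus found, 2 = error.
STATUS_BY_EXIT_CODE = {0: "clean", 1: "infected"}


def scan_file(path: str, run: Callable = subprocess.run) -> Tuple[str, str]:
    """Scan one file with clamscan and return (status, report text).

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without ClamAV installed.
    """
    result = run(
        ["clamscan", "--no-summary", path],
        capture_output=True,
        text=True,
    )
    # Any exit code other than 0 or 1 is treated as a scanner error.
    status = STATUS_BY_EXIT_CODE.get(result.returncode, "error")
    return status, result.stdout.strip()
```

An ingest script could reject or quarantine anything whose status is not
"clean" and pass the report text back to the contributor.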

We use Jhove for file validation and look forward to more geospatial
format support.  Jhove is excellent at what it does, but I wouldn't
anticipate that it will support virus scanning in the future.

In addition, we use the Unix file utility and magic numbers to look for
executables, since that would be pretty suspicious for geospatial data.
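
The magic-number test can also be approximated directly in Python. This
is only a sketch: the signature table is a small, hand-picked subset of
what the file utility knows, the function name is hypothetical, and a
match is a hint to look closer, not proof (a two-byte prefix like "MZ"
can occur in harmless data).

```python
from typing import Optional

# Leading byte signatures of some common executable formats.
EXECUTABLE_MAGIC = {
    b"\x7fELF": "ELF executable",
    b"MZ": "DOS/Windows executable",
    b"\xcf\xfa\xed\xfe": "Mach-O executable (64-bit, little-endian)",
    b"#!": "script with an interpreter line",
}


def executable_kind(path: str) -> Optional[str]:
    """Return a description if the file starts with a known executable
    signature, otherwise None."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    for magic, kind in EXECUTABLE_MAGIC.items():
        if head.startswith(magic):
            return kind
    return None
```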



Jim

-- 
-------------------------------
Jim Tuttle
Geospatial Data Librarian

NCSU Libraries, Box 7111
North Carolina State University
Raleigh, NC 27695-7111
jim_tuttle at ncsu.edu

(919)513-0651 Phone
(919)515-3031  Fax


_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech