[
https://jira.duraspace.org/browse/DS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=19244#action_19244
]
Tim Donohue commented on DS-638:
--------------------------------
Just wanted to again point out that currently this issue involves both virus
checking and file format identification.
I think Robin has already detailed a potential use case for virus checking in
the Submission UI. I'd agree with that case. From what I've seen, many
institutions do not enable workflows for the majority of their DSpace
Collections (we could ask this question to folks in DCAT as a mini-survey, to
verify as needed). To go back and force an institution to retroactively enable
workflows on all their existing collections (in order to get virus checking
enabled) seems a bit odd to me, to be honest.
I'd also point out that integration of Curation Framework into the Submission
UI may also be necessary for improved *file format identification*. Obviously,
the Submission UI already has a very simplistic file format identification (by
just checking the file extension and matching it against the DSpace Bitstream
Format Registry). It seems that once there is a DROID-based Curation task, it
would be ideal to use this as a way to improve upon our simplistic file format
identification processes (as DROID is a file format identification tool).
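To make the gap concrete, here is a minimal sketch of extension-based versus content-based identification. It is not DSpace or DROID code: the class and method names are made up for illustration, and it only recognises the GIF signature, whereas a signature-based tool such as DROID covers far more formats.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Illustrative only: none of these names come from the DSpace code base.
final class FormatSniffer {

    // Extension-only guess, similar in spirit to a registry lookup by file extension.
    static String guessByExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "unknown";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Content-based guess: GIF files begin with the ASCII signature "GIF87a" or "GIF89a".
    static String guessByContent(byte[] head) {
        if (head.length >= 6) {
            String signature = new String(head, 0, 6, StandardCharsets.US_ASCII);
            if (signature.equals("GIF87a") || signature.equals("GIF89a")) {
                return "gif";
            }
        }
        return "unknown";
    }

    // True when the content is recognised and contradicts the file extension.
    static boolean extensionMismatch(String fileName, byte[] head) {
        String byContent = guessByContent(head);
        return !byContent.equals("unknown")
                && !byContent.equals(guessByExtension(fileName));
    }
}
```

With this sketch, a GIF renamed to "picture.txt" is reported as a mismatch, while an extension-only check would simply record it as text.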
That being said -- I agree with some of your concerns, Richard. In reality, part
of the issue here is that so much of this submission processing logic is at the
application level, so SWORD doesn't share the same logic as JSPUI or XMLUI.
But, I think this particular JIRA issue is only requesting submission UI
integration for JSPUI & XMLUI (which actually *do* use the same submission
processing logic -- see the org.dspace.submit.step.* classes).
In the end, this might be an area where we need to offer both options (either
enabling in Submission UI or in Workflow), based on your local institution's
policies. Perhaps we default to only performing these checks in one place, but
allow an institution to choose otherwise as they see fit?
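As a rough sketch of what "default to one place, but let the institution choose" could look like, assuming a single configuration property: the property name "upload.checks.stage" and the CheckStage enum below are hypothetical and do not exist in DSpace.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch: the property key and this enum are assumptions, not DSpace settings.
enum CheckStage {
    SUBMISSION, WORKFLOW, BOTH;

    static final String PROPERTY = "upload.checks.stage"; // hypothetical key

    // Defaults to a single place (submission) when the property is absent or unrecognised.
    static CheckStage fromConfig(Properties config) {
        String raw = config.getProperty(PROPERTY, "submission").trim();
        try {
            return valueOf(raw.toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return SUBMISSION;
        }
    }

    boolean runsInSubmission() {
        return this == SUBMISSION || this == BOTH;
    }

    boolean runsInWorkflow() {
        return this == WORKFLOW || this == BOTH;
    }
}
```

Defaulting to SUBMISSION here is only one possible choice; which default makes sense is exactly the policy question raised above.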
> check files on input for viruses, and verify file format
> ---------------------------------------------------------
>
> Key: DS-638
> URL: https://jira.duraspace.org/browse/DS-638
> Project: DSpace
> Issue Type: New Feature
> Components: JSPUI
> Affects Versions: 1.6.2
> Environment: to use this patch you will need to have ClamAV, and
> jhove installed on your system.
> Reporter: Jose Blanco
> Assignee: Robin Taylor
> Attachments: java_files.zip, jhove_config_files.zip, jsp_files.zip
>
>
> This patch uses JHOVE to provide rough-and-ready format checking by
> identifying that the file/bitstream extension matches formats verifiable by
> JHOVE. (Currently DSpace accepts a deposit's file extension as gospel, so a
> user could tack a ".txt" extension onto a GIF and DSpace would assign the
> incorrect format to the file based on that incorrect extension.)
> This patch also contains code to check the file for the presence of
> viruses.
> In order to use this patch you must have jhove and ClamAV installed on your
> system.
> Important notes:
> (1) HTML identification by jhove has proved unreliable, so this patch
> does not return accurate results for that file format.
> (2) This code does not fully incorporate JHOVE's validation functions; it
> only verifies that what depositors intended to submit is in fact what they
> submitted.
> The following are returned messages when an error is detected:
> Text in [brackets] is a returned value, ALLCAPS can/should be modified to
> reflect your current installation.
> Questionable AIFF, GIF, JPG, PDF, TIF, WAVE, XML:
> DSPACE could not verify that your file is a valid [file_format_extension].
> Please check the file format and ".[file_format_extension]" extension.
> Questionable TXT:
> DSPACE found the text file you are trying to upload is neither UTF-8 nor
> ASCII. Please verify that your file is in the format you wanted.
> Spaces in filenames ( this is an additional check ):
> The file name contains spaces; this is not recommended. If possible, please
> replace spaces with underscores: "_".
> Virus detected:
> DSPACE detected a virus in this file. Please repair it and resume the
> deposit. If you need assistance, please contact us: EMAIL_ADDRESS.
> To get the patch working:
> Add the jhove conf files to the [dspace]/jhove directory.
> Here are the conf files:
> jhove-aiff.conf
> jhove-ascii.conf
> jhove-gif.conf
> jhove-jpeg.conf
> jhove-pdf.conf
> jhove-tiff.conf
> jhove-utf8.conf
> jhove-wave.conf
> jhove-xml.conf
> Also the following files were changed:
> dspace-api/src/main/java/org/dspace/submit/step/UploadStep.java
> dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/submit/step/JSPUploadStep.java
> dspace-api/src/main/java/org/dspace/content/FormatIdentifier.java
> dspace/modules/jspui/src/main/webapp/submit/get-file-format.jsp ( locally
> customized )
> dspace/modules/jspui/src/main/webapp/submit/upload-error-virus.jsp ( new file
> - placed in locally modified area for the jspui interface)
> These files are attached with this patch.
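For reference, the TXT and filename checks described in the quoted issue can be sketched in plain Java, without JHOVE or ClamAV. This is not the attached patch: the class and method names are hypothetical, and the encoding check relies only on the fact that ASCII is a subset of UTF-8, so one strict UTF-8 decode covers both cases.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the names are illustrative and not taken from the patch.
final class UploadChecks {

    // "Spaces in filenames" check.
    static boolean hasSpaces(String fileName) {
        return fileName.indexOf(' ') >= 0;
    }

    // Suggested replacement name, with spaces turned into underscores.
    static String withUnderscores(String fileName) {
        return fileName.replace(' ', '_');
    }

    // "Questionable TXT" check: true if the bytes decode as strict UTF-8
    // (which includes pure ASCII), false on any malformed byte sequence.
    static boolean isUtf8OrAscii(byte[] content) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(content));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

A caller would map a false result from isUtf8OrAscii, or a true result from hasSpaces, to the corresponding message listed in the issue.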