AIP Packager should support TAR / TAR.GZ as well as ZIP
-------------------------------------------------------
Key: DS-1137
URL: https://jira.duraspace.org/browse/DS-1137
Project: DSpace
Issue Type: Improvement
Components: DSpace API
Affects Versions: 1.8.2, 1.8.1, 1.8.0, 1.7.2, 1.7.1, 1.7.0
Reporter: Tim Donohue
Priority: Major
Fix For: 3.0
This improvement request is based on a dspace-tech email thread:
http://www.mail-archive.com/[email protected]/msg16350.html
Essentially, in Java 6 and below, Java's Zip support (java.util.zip) has file
size limitations: the Zip file itself cannot be more than 4GB in size.
In Java 7, this has been fixed so that Zip64 is now supported
(https://blogs.oracle.com/xuemingshen/entry/zip64_support_for_4g_zipfile).
Regardless of Java version, the AIP Packager should also support TAR or TAR.GZ
(TGZ) as an option.
This change would likely necessitate the following:
(1) Change the 'writeZipPackage()' method of
'org.dspace.content.packager.AbstractMETSDisseminator' so that it is
configurable to use either:
- java.util.zip.ZipOutputStream (and associated classes)
- OR, org.apache.commons.compress.archivers.tar.TarArchiveOutputStream
(and associated classes)
(2) Change the 'parsePackage()' method of
'org.dspace.content.packager.AbstractMETSIngester' so that it is also
configurable to use either:
- java.util.zip.ZipInputStream (and associated classes)
- OR, org.apache.commons.compress.archivers.tar.TarArchiveInputStream
(and associated classes)
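
As a rough illustration of item (1), the write side could sit behind a small
format-neutral interface, so that 'writeZipPackage()' no longer talks to
ZipOutputStream directly. This is only a sketch under assumptions: the names
PackageWriters, Format, PackageWriter and open() are hypothetical and do not
exist in DSpace. To keep the sketch free of external dependencies, only the Zip
branch is implemented; the TGZ branch marks where a Commons Compress
TarArchiveOutputStream would go.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper; none of these names exist in DSpace. */
final class PackageWriters {

    /** Archive formats the disseminator could be configured to emit. */
    enum Format { ZIP, TGZ }

    /** Minimal format-neutral view of "add one named file to the package". */
    interface PackageWriter extends Closeable {
        void addEntry(String name, byte[] content) throws IOException;
    }

    /** Zip-backed writer that uses only java.util.zip. */
    static final class ZipPackageWriter implements PackageWriter {
        private final ZipOutputStream zip;

        ZipPackageWriter(OutputStream out) {
            this.zip = new ZipOutputStream(out);
        }

        @Override
        public void addEntry(String name, byte[] content) throws IOException {
            zip.putNextEntry(new ZipEntry(name));
            zip.write(content);
            zip.closeEntry();
        }

        @Override
        public void close() throws IOException {
            // Writes the central directory and closes the underlying stream.
            zip.close();
        }
    }

    /** Picks the writer for the requested format. */
    static PackageWriter open(Format format, OutputStream out) {
        switch (format) {
            case ZIP:
                return new ZipPackageWriter(out);
            case TGZ:
                // Assumption: a real implementation would use a Commons Compress
                // TarArchiveOutputStream writing into a GZIPOutputStream here.
                // Left out so the sketch compiles with the JDK alone.
                throw new UnsupportedOperationException(
                        "TGZ writer is not implemented in this sketch");
            default:
                throw new IllegalArgumentException("Unknown format: " + format);
        }
    }

    private PackageWriters() { }
}
```

With something like this in place, the disseminator would only call addEntry()
for the METS manifest and each bitstream, and the choice of archive format
would be confined to open().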
I've not tried this change, so more classes may be affected.
Ideally, this would be made configurable, so that institutions could choose
whether to use Zip or Tar. As an example, see how the Replication Task Suite's
BagIt packer supports either Zip or TGZ as its archive format:
https://github.com/DSpace/dspace-replicate/blob/master/src/main/java/org/dspace/pack/bagit/Bag.java#L308
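
One consequence of making the format configurable is that the ingest side may
be handed a package written under a different setting. A possible safeguard,
sketched below, is for 'parsePackage()' to peek at the leading bytes of the
stream rather than rely on configuration alone. The class and method names are
hypothetical, not DSpace API. The sketch only distinguishes a Zip local file
header ("PK" followed by bytes 3 and 4) from a GZIP stream (bytes 0x1f 0x8b);
it does not recognise uncompressed TAR or an empty Zip archive.

```java
import java.io.BufferedInputStream;
import java.io.IOException;

/** Hypothetical helper; not part of DSpace. */
final class PackageFormatSniffer {

    /** GZIP here presumably means a TAR.GZ package, but that is not verified. */
    enum Format { ZIP, GZIP, UNKNOWN }

    /**
     * Peeks at up to four leading bytes and then rewinds the stream,
     * so the caller can still read the package from the start.
     */
    static Format sniff(BufferedInputStream in) throws IOException {
        byte[] head = new byte[4];
        in.mark(head.length);
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream shorter than four bytes
            }
            n += r;
        }
        in.reset();

        if (n >= 4 && head[0] == 'P' && head[1] == 'K'
                && head[2] == 3 && head[3] == 4) {
            return Format.ZIP;
        }
        if (n >= 2 && (head[0] & 0xff) == 0x1f && (head[1] & 0xff) == 0x8b) {
            return Format.GZIP;
        }
        return Format.UNKNOWN;
    }

    private PackageFormatSniffer() { }
}
```

The ingester could then wrap the same stream in a ZipInputStream, or in a
GZIPInputStream feeding a TarArchiveInputStream, depending on the result.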
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://jira.duraspace.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel