AIP Packager should support TAR / TAR.GZ as well as ZIP
-------------------------------------------------------

                 Key: DS-1137
                 URL: https://jira.duraspace.org/browse/DS-1137
             Project: DSpace
          Issue Type: Improvement
          Components: DSpace API
    Affects Versions: 1.8.2, 1.8.1, 1.8.0, 1.7.2, 1.7.1, 1.7.0
            Reporter: Tim Donohue
            Priority: Major
             Fix For: 3.0


This improvement request is based on a dspace-tech email thread: 
http://www.mail-archive.com/[email protected]/msg16350.html

Essentially, in Java 6 and below, Java Zip (java.util.zip) has file size 
limitations.  The Zip file itself cannot be more than 4GB in size.

In Java 7, this has been fixed so that Zip64 is now supported 
(https://blogs.oracle.com/xuemingshen/entry/zip64_support_for_4g_zipfile).

Regardless of the Java version, the AIP Packager should also support TAR or 
TAR.GZ (TGZ) as an option.

This change would likely necessitate the following:

(1) Change the 'writeZipPackage()' method of 
'org.dspace.content.packager.AbstractMETSDisseminator' so that it is 
configurable to use either:
     - java.util.zip.ZipOutputStream (and associated classes)
     - OR, org.apache.commons.compress.archivers.tar.TarArchiveOutputStream
       (and associated classes)
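
One way the write side could be made pluggable is sketched below. This is only 
an illustration, not existing DSpace code: the names PackageArchiveWriter, 
ZipPackageArchiveWriter and PackageArchiveWriters are hypothetical, and only 
the Zip branch is implemented (with java.util.zip). A TAR/TGZ branch would 
presumably wrap TarArchiveOutputStream from Apache Commons Compress at the 
marked spot.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical abstraction (not part of DSpace): the disseminator would write
// package entries through this interface instead of calling ZipOutputStream directly.
interface PackageArchiveWriter extends Closeable {
    void addEntry(String name, byte[] content) throws IOException;
}

// Zip-backed implementation using only java.util.zip.
class ZipPackageArchiveWriter implements PackageArchiveWriter {
    private final ZipOutputStream zip;

    ZipPackageArchiveWriter(OutputStream out) {
        this.zip = new ZipOutputStream(out);
    }

    @Override
    public void addEntry(String name, byte[] content) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(content);
        zip.closeEntry();
    }

    @Override
    public void close() throws IOException {
        zip.close();
    }
}

// Hypothetical factory: picks a writer from a format name such as "zip".
class PackageArchiveWriters {
    static PackageArchiveWriter forFormat(String format, OutputStream out) {
        if ("zip".equalsIgnoreCase(format)) {
            return new ZipPackageArchiveWriter(out);
        }
        // Assumption: a "tar" / "tgz" branch would go here, wrapping
        // org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.
        throw new IllegalArgumentException("Unsupported package format: " + format);
    }
}
```

With something like this, 'writeZipPackage()' would only need to ask the 
factory for a writer and add the METS manifest and bitstreams as entries.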

(2) Change the 'parsePackage()' method of 
'org.dspace.content.packager.AbstractMETSIngester' so that it is also 
configurable to use either:
      - java.util.zip.ZipInputStream (and associated classes)
      - OR, org.apache.commons.compress.archivers.tar.TarArchiveInputStream
        (and associated classes)
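
The read side could dispatch on format in a similar way. The sketch below is 
again hypothetical (PackageArchiveReaders and readEntries are made-up names, 
not DSpace API) and implements only the Zip branch. A TAR/TGZ branch would 
presumably iterate a TarArchiveInputStream from Apache Commons Compress. For 
brevity it loads each entry into memory, which a real ingester handling 
multi-gigabyte packages would want to avoid.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper (not part of DSpace): reads all file entries of a
// package into a name -> bytes map, choosing the reader by format name.
class PackageArchiveReaders {

    static Map<String, byte[]> readEntries(String format, InputStream in) throws IOException {
        if ("zip".equalsIgnoreCase(format)) {
            return readZip(in);
        }
        // Assumption: a "tar" / "tgz" branch would go here, iterating
        // org.apache.commons.compress.archivers.tar.TarArchiveInputStream.
        throw new IllegalArgumentException("Unsupported package format: " + format);
    }

    private static Map<String, byte[]> readZip(InputStream in) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content
                }
                entries.put(entry.getName(), readFully(zip));
            }
        }
        return entries;
    }

    // Copies the current entry's bytes; ZipInputStream reports end-of-stream
    // at the end of each entry.
    private static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }
}
```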

I've not tried this change, so more classes may be affected.  

Ideally, this would be made configurable, so that institutions could choose 
whether to use Zip or Tar.  As an example, see how the Replication Task Suite's 
BagIt packer can support either Zip or TGZ archiving format: 
https://github.com/DSpace/dspace-replicate/blob/master/src/main/java/org/dspace/pack/bagit/Bag.java#L308
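
As a rough idea of what the configuration side might look like, the sketch 
below maps a configured string to an archive format, defaulting to Zip when 
nothing is set so existing installations keep their current behavior. The enum 
name AipArchiveFormat, its accepted values and the file extensions are all 
assumptions for illustration. No such DSpace setting exists yet, and this is 
not how the Replication Task Suite does it.

```java
import java.util.Locale;

// Hypothetical enum (not part of DSpace): the archive formats an institution
// could choose between, with the file extension each would produce.
enum AipArchiveFormat {
    ZIP(".zip"),
    TAR(".tar"),
    TGZ(".tar.gz");

    private final String extension;

    AipArchiveFormat(String extension) {
        this.extension = extension;
    }

    String extension() {
        return extension;
    }

    // Parses a configured value; null or blank falls back to ZIP.
    static AipArchiveFormat fromConfig(String value) {
        if (value == null || value.trim().isEmpty()) {
            return ZIP;
        }
        String v = value.trim().toLowerCase(Locale.ROOT);
        switch (v) {
            case "zip":
                return ZIP;
            case "tar":
                return TAR;
            case "tgz":
            case "tar.gz":
                return TGZ;
            default:
                throw new IllegalArgumentException("Unknown AIP archive format: " + value);
        }
    }
}
```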

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://jira.duraspace.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel