[
https://issues.apache.org/jira/browse/COMPRESS-477?focusedWorklogId=341701&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-341701
]
ASF GitHub Bot logged work on COMPRESS-477:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 12/Nov/19 07:09
Start Date: 12/Nov/19 07:09
Worklog Time Spent: 10m
Work Description: PeterAlfreadLee commented on pull request #86:
COMPRESS-477 building a split zip
URL: https://github.com/apache/commons-compress/pull/86
[COMPRESS-477](https://issues.apache.org/jira/projects/COMPRESS/issues/COMPRESS-477)
Add support for building a split/spanned zip.
Sample code:
```
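import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;

import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveOutputStream;
import org.apache.commons.compress.utils.IOUtils;
import org.junit.Test;

// `dir` and getFilesToZip() are expected to come from the enclosing test class.
@Test
public void buildSplitZipTest() throws IOException {
    File directoryToZip = getFilesToZip();
    File outputZipFile = new File(dir, "splitZip.zip");
    long splitSize = 100 * 1024L; /* 100 KB */
    // try-with-resources closes the stream even if adding a file fails
    try (ZipArchiveOutputStream zipArchiveOutputStream =
            new ZipArchiveOutputStream(outputZipFile, splitSize)) {
        addFilesToZip(zipArchiveOutputStream, directoryToZip);
    }
    // TODO: validate the created zip files when extracting split zip is
    // merged into master
}

// Recursively adds a file, or every file below a directory, to the archive.
private void addFilesToZip(ZipArchiveOutputStream zipArchiveOutputStream,
        File fileToAdd) throws IOException {
    if (fileToAdd.isDirectory()) {
        File[] children = fileToAdd.listFiles();
        if (children == null) {
            throw new IOException("Cannot list " + fileToAdd);
        }
        for (File file : children) {
            addFilesToZip(zipArchiveOutputStream, file);
        }
    } else {
        ZipArchiveEntry zipArchiveEntry = new ZipArchiveEntry(fileToAdd.getPath());
        zipArchiveEntry.setMethod(ZipEntry.DEFLATED);
        zipArchiveOutputStream.putArchiveEntry(zipArchiveEntry);
        // Close the input stream once its content has been copied
        try (InputStream in = new FileInputStream(fileToAdd)) {
            IOUtils.copy(in, zipArchiveOutputStream);
        }
        zipArchiveOutputStream.closeArchiveEntry();
    }
}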
```
This PR adds a new class, `ZipSplitOutputStream`. The main points of the
implementation are:
1. Write the zip split signature to the zip file in the constructor of
`ZipSplitOutputStream` by calling `writeZipSplitSignature`;
2. Based on the zip specification, the split size must be between 64 KB and
4,294,967,295 bytes;
3. Name the split zip files .z01, .z02, ... , .z(N-1), .zip ONLY IF there is
more than one split segment;
4. Produce a single segment with the suffix .zip IF the split size is big
enough (that is, the split size is bigger than the actual zip size);
5. Create a new zip split segment when the data to write would exceed the
split size; the newly created segments are named in sequence:
.z01, .z02, ..., .z99, .z100, .z101, ... , .zip;
6. Based on the zip specification, the End Of Central Directory (EOCD) and
the Zip64 End Of Central Directory Locator (Zip64_EOCDL) must reside on the
same segment, so `ZipSplitOutputStream` creates a new segment before writing
the EOCD and Zip64_EOCDL if the space remaining in the current segment is not
enough for them;
7. When creating a `ZipArchiveOutputStream`, if the split size is specified,
a split zip is created instead of a normal zip (because `ZipSplitOutputStream`
needs the file name when creating new split segments, the constructor is
`public ZipArchiveOutputStream(final File file, final long zipSplitSize)`);
8. The disk number, relative offset, number of this disk, number of Central
Directories on this disk, total number of disks in Central Directory, Zip64 End
Of Central Directory, Zip64 End Of Central Directory Locator and End Of Central
Directory are all set to the correct values when writing a split/spanned zip;
9. The test cases need to be updated once
[#84](https://github.com/apache/commons-compress/pull/84) is merged, because I
have not found a way to test the created split/spanned zip on Linux. I have
tested it on Windows and it works well;
10. This PR has some minor conflicts with
[#84](https://github.com/apache/commons-compress/pull/84), and I will resolve
them as soon as #84 is merged.
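The rules in points 1, 2, 5 and 6 can be illustrated with a small standalone
sketch. This is not the code from the PR: the class `SplitZipSketch` and all of
its method names are hypothetical, "64 KB" is assumed to mean 65,536 bytes, and
the signature value 0x08074b50 is the one quoted in COMPRESS-477. It only shows
one way the checks and the naming scheme could look.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Locale;

/** Hypothetical helper; none of these names come from the PR itself. */
final class SplitZipSketch {
    /** Lower bound from point 2, assuming 64 KB means 64 * 1024 bytes. */
    static final long MIN_SPLIT_SIZE = 64 * 1024L;
    /** Upper bound from point 2: 4,294,967,295 bytes. */
    static final long MAX_SPLIT_SIZE = 4_294_967_295L;
    /** Split signature as quoted in the issue description. */
    static final int SPLIT_SIGNATURE = 0x08074b50;

    private SplitZipSketch() {
    }

    /** Returns splitSize unchanged, or throws if it is outside the allowed range. */
    static long checkSplitSize(long splitSize) {
        if (splitSize < MIN_SPLIT_SIZE || splitSize > MAX_SPLIT_SIZE) {
            throw new IllegalArgumentException(
                "split size must be between " + MIN_SPLIT_SIZE + " and "
                    + MAX_SPLIT_SIZE + " bytes, got " + splitSize);
        }
        return splitSize;
    }

    /**
     * Builds a segment file name. The index is 1-based; the last segment
     * always gets the .zip suffix, all others get .z01, .z02, ..., .z100, ...
     */
    static String segmentName(String baseName, int index, boolean isLast) {
        if (index < 1) {
            throw new IllegalArgumentException("segment index starts at 1");
        }
        if (isLast) {
            return baseName + ".zip";
        }
        // %02d pads to at least two digits and grows naturally beyond 99
        return baseName + String.format(Locale.ROOT, ".z%02d", index);
    }

    /** The split signature in the little-endian byte order used by ZIP. */
    static byte[] splitSignatureBytes() {
        return ByteBuffer.allocate(4)
            .order(ByteOrder.LITTLE_ENDIAN)
            .putInt(SPLIT_SIGNATURE)
            .array();
    }

    /**
     * True if a block that must not be divided (for example the EOCD together
     * with the Zip64_EOCDL) no longer fits into the current segment.
     */
    static boolean needsNewSegment(long bytesInCurrentSegment, long splitSize,
                                   long unsplittableLength) {
        return splitSize - bytesInCurrentSegment < unsplittableLength;
    }
}
```

With these helpers, `segmentName("splitZip", 1, false)` gives `splitZip.z01`
and `segmentName("splitZip", 3, true)` gives `splitZip.zip`. The length of the
block that must stay together is left as a parameter because it depends on
what is actually written (for instance the archive comment).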
Please feel free to let me know if the code needs to be refactored or
rebased. I look forward to your reviews. :-) @bodewig @garydgregory
Issue Time Tracking
-------------------
Worklog Id: (was: 341701)
Time Spent: 3h (was: 2h 50m)
> Support for splitted zip files
> ------------------------------
>
> Key: COMPRESS-477
> URL: https://issues.apache.org/jira/browse/COMPRESS-477
> Project: Commons Compress
> Issue Type: New Feature
> Components: Archivers
> Affects Versions: 1.18
> Reporter: Luís Filipe Nassif
> Priority: Major
> Labels: zip
> Time Spent: 3h
> Remaining Estimate: 0h
>
> It would be very useful to support splitted zip files. I've read
> [https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT] and understood
> that simply concatenating the segments and removing the split signature
> 0x08074b50 from first segment would be sufficient, but it is not that simple
> because compress fails with exception below:
> {code}
> Caused by: java.util.zip.ZipException: archive's ZIP64 end of central directory locator is corrupt.
>     at org.apache.commons.compress.archivers.zip.ZipFile.positionAtCentralDirectory64(ZipFile.java:924) ~[commons-compress-1.18.jar:1.18]
>     at org.apache.commons.compress.archivers.zip.ZipFile.positionAtCentralDirectory(ZipFile.java:901) ~[commons-compress-1.18.jar:1.18]
>     at org.apache.commons.compress.archivers.zip.ZipFile.populateFromCentralDirectory(ZipFile.java:621) ~[commons-compress-1.18.jar:1.18]
>     at org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:295) ~[commons-compress-1.18.jar:1.18]
>     at org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:280) ~[commons-compress-1.18.jar:1.18]
>     at org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:236) ~[commons-compress-1.18.jar:1.18]
> {code}