[ https://issues.apache.org/jira/browse/COMPRESS-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090153#comment-17090153 ]
Peter Lee commented on COMPRESS-510:
------------------------------------
> Commons-compress does not get a CRC in the SevenZFile#readFilesInfo method.
> But the 7z GUI shows me a CRC (784DD132 for test.txt).
Actually, 784DD132 is the CRC checksum of the 'folder' (stored in the coders
info), not the CRC of the file (stored in the substreams info).
Currently Commons Compress does not write the CRC in the substreams info
(although it can read and validate a CRC that is present there).
When I talk about 'folder' in 7z, it's not a directory - it's nothing special
but a bunch of files. All the files in the same 'folder' will be regarded as a
single file and be compressed together. This is different from Zip and that's
why it's difficult to have random access in 7z. And the compressed 'folder' has
a CRC checksum.
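
A rough sketch of why this gets in the way of random access. It is not how
Commons Compress or 7z implement it: Deflate from java.util.zip stands in for
the real 7z coders, and the class and method names (SolidFolderSketch,
packFolder, readFile) are made up for illustration. The point is only that
reaching the second file means decoding the first one as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration only; Deflate stands in for the 7z coders.
final class SolidFolderSketch {

    /** Compresses all files as one continuous stream, like a 7z 'folder'. */
    static byte[] packFolder(byte[][] files) {
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(packed)) {
            for (byte[] file : files) {
                out.write(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return packed.toByteArray();
    }

    /** Reads the file at 'index' by decoding from the start of the folder. */
    static byte[] readFile(byte[] packedFolder, int[] sizes, int index) {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(packedFolder))) {
            for (int i = 0; i < index; i++) {
                readFully(in, sizes[i]); // earlier files must be decoded and discarded
            }
            return readFully(in, sizes[index]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads exactly 'length' bytes or fails if the stream ends early. */
    private static byte[] readFully(InputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];
        int offset = 0;
        while (offset < length) {
            int n = in.read(buffer, offset, length - offset);
            if (n < 0) {
                throw new IOException("Unexpected end of folder stream");
            }
            offset += n;
        }
        return buffer;
    }

    public static void main(String[] args) {
        byte[][] files = {
            "first".getBytes(StandardCharsets.UTF_8),
            "second file".getBytes(StandardCharsets.UTF_8)
        };
        byte[] packed = packFolder(files);
        int[] sizes = {files[0].length, files[1].length};
        // prints: second file
        System.out.println(new String(readFile(packed, sizes, 1), StandardCharsets.UTF_8));
    }
}
```

In Zip every entry is compressed on its own, so a reader can jump straight to
an entry's offset instead of decoding its neighbours first.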
It seems the 7z GUI shows the CRC of the folder from the coders info when the
CRC in the substreams info is absent.
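
The fallback I suspect the GUI applies could look roughly like this. This is a
guess at the behaviour, not code taken from 7-Zip or Commons Compress; the
class and method names are hypothetical, and a null value stands for "no CRC
stored in that place".

```java
// Hypothetical illustration of the suspected display fallback.
final class CrcFallbackSketch {

    /**
     * Picks the CRC to show for a file: the substreams-info CRC if present,
     * otherwise the folder CRC from the coders info, otherwise nothing.
     */
    static String display(Long substreamCrc, Long folderCrc) {
        Long crc = substreamCrc != null ? substreamCrc : folderCrc;
        return crc == null ? "" : String.format("%08X", crc);
    }

    public static void main(String[] args) {
        // No CRC in the substreams info, so the folder CRC is shown.
        // prints: 784DD132
        System.out.println(display(null, 0x784DD132L));
    }
}
```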
This may look complicated, but it's legal according to the 7z format
specification.
> Multiple retrievals of InputStream for same SevenZFile entry fails
> ------------------------------------------------------------------
>
> Key: COMPRESS-510
> URL: https://issues.apache.org/jira/browse/COMPRESS-510
> Project: Commons Compress
> Issue Type: Bug
> Affects Versions: 1.20
> Reporter: Robin Schimpf
> Assignee: Peter Lee
> Priority: Major
> Attachments: image-2020-04-22-16-55-08-369.png
>
>
> I was trying out the new random access for the 7z files and have one of our
> tests failing where we are trying to read the same entry multiple times
> without closing the archive.
> Reproducing test case (I added this locally to the SevenZFileTest class)
> {code:java}
> @Test
> public void retrieveInputStreamForEntryMultipleTimes() throws IOException {
>     try (SevenZFile sevenZFile = new SevenZFile(getFile("bla.7z"))) {
>         for (SevenZArchiveEntry entry : sevenZFile.getEntries()) {
>             byte[] firstRead = IOUtils.toByteArray(sevenZFile.getInputStream(entry));
>             byte[] secondRead = IOUtils.toByteArray(sevenZFile.getInputStream(entry));
>             assertArrayEquals(firstRead, secondRead);
>         }
>     }
> }
> {code}
> The Exception thrown is
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 2
>     at org.apache.commons.compress.archivers.sevenz.SevenZFile.buildDecodingStream(SevenZFile.java:1183)
>     at org.apache.commons.compress.archivers.sevenz.SevenZFile.getInputStream(SevenZFile.java:1436)
>     at org.apache.commons.compress.archivers.sevenz.SevenZFileTest.retrieveInputStreamForEntryMultipleTimes(SevenZFileTest.java:688)
> ...
> {code}
> A similar test case for e.g. zip works fine
> {code:java}
> @Test
> public void retrieveInputStreamForEntryMultipleTimes() throws IOException {
>     try (ZipFile zipFile = new ZipFile(getFile("bla.zip"))) {
>         Enumeration<ZipArchiveEntry> entry = zipFile.getEntries();
>         while (entry.hasMoreElements()) {
>             ZipArchiveEntry e = entry.nextElement();
>             byte[] firstRead = IOUtils.toByteArray(zipFile.getInputStream(e));
>             byte[] secondRead = IOUtils.toByteArray(zipFile.getInputStream(e));
>             assertArrayEquals(firstRead, secondRead);
>         }
>     }
> }
> {code}