[
https://issues.apache.org/jira/browse/COMPRESS-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009810#comment-17009810
]
Stefan Bodewig commented on COMPRESS-494:
-----------------------------------------
I just re-read your output from a while back:
{code:java}
61642628 Stored 61642628 0% 10-31-2019 01:43 454789de
cloud_3672_20191031010003.log.gz
{code}
This is a {{STORED}} entry using a data descriptor and you are reading it with
{{ZipArchiveInputStream}}? I.e. you have explicitly set the
{{allowStoredEntriesWithDataDescriptor}} constructor argument to {{true}}?
In that case it is quite possible this simply cannot work. See all the warnings
we've put into the code and on the website.
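For reference, a minimal sketch of the streaming setup in question, assuming the four-argument constructor ({{archive.zip}} is a placeholder path):
{code:java}
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveInputStream;

public class StoredDescriptorRead {
    public static void main(String[] args) throws IOException {
        // "archive.zip" is a placeholder; the last constructor argument opts in
        // to the best-effort handling of STORED entries that use a data
        // descriptor - see the warnings in the javadocs before relying on it.
        try (InputStream in = new BufferedInputStream(
                Files.newInputStream(Paths.get("archive.zip")));
             ZipArchiveInputStream zin =
                new ZipArchiveInputStream(in, "UTF-8", true, true)) {
            ZipArchiveEntry entry;
            while ((entry = zin.getNextZipEntry()) != null) {
                System.out.println(entry.getName() + " " + entry.getSize());
            }
        }
    }
}
{code}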
Let me explain.
ZIP archives store the real meta-data at the very end of the archive, so it is
not accessible to {{ZipArchiveInputStream}}. {{ZipFile}} can use it and will
work a lot better than {{ZipArchiveInputStream}} - unless you cannot use it, of
course.
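The same split exists in plain {{java.util.zip}}, which makes it easy to demonstrate without Commons Compress: a streaming reader only sees the local file header, while the random-access reader parses the central directory at the end. A self-contained sketch:
{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class CentralDirectoryDemo {
    // Builds a small ZIP on disk with one DEFLATED entry and returns its path.
    static Path buildZip() throws IOException {
        Path zip = Files.createTempFile("demo", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            out.putNextEntry(new ZipEntry("hello.txt"));
            out.write("hello, zip".getBytes(StandardCharsets.US_ASCII));
            out.closeEntry();
        }
        return zip;
    }

    // Streaming reader: only the local file header is available at this point,
    // so the size is still unknown (-1).
    static long sizeSeenByStream(Path zip) throws IOException {
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(zip))) {
            return in.getNextEntry().getSize();
        }
    }

    // Random-access reader: parses the central directory at the end of the
    // file, so the size is known immediately.
    static long sizeSeenByZipFile(Path zip) throws IOException {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            return zf.entries().nextElement().getSize();
        }
    }

    public static void main(String[] args) throws IOException {
        Path zip = buildZip();
        System.out.println("streaming sees size: " + sizeSeenByStream(zip));  // -1
        System.out.println("ZipFile sees size:   " + sizeSeenByZipFile(zip)); // 10
        Files.delete(zip);
    }
}
{code}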
ZIP archives store a subset of the meta-data just before the data of each
entry, in the so-called local file header. This can also include the size - but
it does not have to.
An entry can signal that it writes the sizes directly after its content instead
of putting them into the local file header. This works well for formats that
can signal an entry is complete, like {{DEFLATED}} entries (the content
contains an end-of-stream marker).
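This is visible in the raw bytes. The sketch below (again plain {{java.util.zip}}) writes one {{DEFLATED}} entry and checks that bit 3 of the general purpose flag announces a data descriptor, while the size fields in the local file header stay zero:
{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class LocalHeaderDemo {
    // Writes one DEFLATED entry and returns the raw archive bytes.
    static byte[] buildZip() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buf)) {
            out.putNextEntry(new ZipEntry("a.txt"));
            out.write(new byte[] {1, 2, 3});
            out.closeEntry();
        }
        return buf.toByteArray();
    }

    // Little-endian unsigned 16-bit read.
    static int u16(byte[] b, int off) {
        return (b[off] & 0xff) | ((b[off + 1] & 0xff) << 8);
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = buildZip();
        // The archive starts with the local file header signature "PK\3\4".
        System.out.printf("signature: %c%c %d %d%n", zip[0], zip[1], zip[2], zip[3]);
        // Offset 6: general purpose bit flag; bit 3 announces a data descriptor.
        System.out.println("data descriptor bit set: " + ((u16(zip, 6) & 0x08) != 0));
        // Offsets 18-21 / 22-25: compressed / uncompressed size. Both are zero
        // here because the real sizes only follow the data, in the descriptor.
        System.out.println("sizes in local header: " + u16(zip, 18) + ", " + u16(zip, 22));
    }
}
{code}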
Stored entries are what you would expect from their name: the real content of
the entry is stored byte-by-byte. If a data descriptor is used for a stored
entry, {{ZipArchiveInputStream}} "simply" reads all content until it finds
something that looks like a data descriptor (actually it's even worse, it also
has to look for something that looks like the start of the next ZIP entry) and,
if it does, uses what it finds. Sometimes the stored content of an entry itself
contains a sequence of bytes that looks like what {{ZipArchiveInputStream}} is
searching for. In that case all bets are off and there is no way for
{{ZipArchiveInputStream}} to read the entry. No workaround is possible.
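To make the ambiguity concrete: the data-descriptor signature is just the four bytes {{PK\07\08}} (0x08074b50), and nothing stops an entry's payload from containing them. The toy scanner below is a simplified stand-in for what a streaming reader has to do, and shows the false positive:
{code:java}
public class DescriptorScanDemo {
    // The four bytes a streaming reader hunts for: 0x08074b50, i.e. "PK\7\b".
    static final byte[] DD_SIG = {'P', 'K', 7, 8};

    // Naive scan, sketching what a streaming reader must do for a STORED entry
    // with a data descriptor: return the first signature-looking offset.
    static int findSignature(byte[] data) {
        outer:
        for (int i = 0; i + DD_SIG.length <= data.length; i++) {
            for (int j = 0; j < DD_SIG.length; j++) {
                if (data[i + j] != DD_SIG[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        // An entry's real payload that just happens to contain the magic bytes.
        byte[] payload = {'d', 'a', 't', 'a', 'P', 'K', 7, 8, 'm', 'o', 'r', 'e'};
        // The scanner stops at offset 4, mistaking payload bytes for the
        // descriptor; a reader would then report a wrong entry size.
        System.out.println("false match at offset: " + findSignature(payload));
    }
}
{code}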
> ZipArchieveInputStream component is throwing "Invalid Entry Size"
> -----------------------------------------------------------------
>
> Key: COMPRESS-494
> URL: https://issues.apache.org/jira/browse/COMPRESS-494
> Project: Commons Compress
> Issue Type: Bug
> Affects Versions: 1.8, 1.18
> Reporter: Anvesh Mora
> Priority: Critical
> Attachments: commons-compress-1.20-SNAPSHOT.jar
>
>
> I've observed that certain zip files which we are able to extract with the
> unzip utility on Linux fail with the Compress library.
>
> For now I have a stack-trace to share; I'm going to add more here once
> discussion begins:
>
> {code:java}
> Caused by: java.lang.IllegalArgumentException: invalid entry size
>     at org.apache.commons.compress.archivers.zip.ZipArchiveEntry.setSize(ZipArchiveEntry.java:550)
>     at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.readDataDescriptor(ZipArchiveInputStream.java:702)
>     at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.bufferContainsSignature(ZipArchiveInputStream.java:805)
>     at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.readStoredEntry(ZipArchiveInputStream.java:758)
>     at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.readStored(ZipArchiveInputStream.java:407)
>     at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:382)
> {code}
> I missed adding the version info, here it is:
> The version of the lib I'm using is 1.9.
> I also tried version 1.18, and the issue is observed in that version too.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)