[
https://issues.apache.org/jira/browse/COMPRESS-508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074187#comment-17074187
]
Peter Lee commented on COMPRESS-508:
------------------------------------
> I still don't get why you need to go back. Seems you have the header data for
> each file.
Going back only happens when you are using ZipFile, not
ZipArchiveInputStream.
> how does it go over the zip file, in the structure you've shown, to get the
>first entry metadata (name&size) ?
I do not need to go over the whole zip file. I can just iterate over each Local
File Header (using ZipArchiveInputStream), as sketched below.
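For illustration, a minimal sketch (not taken from this thread) of streaming over the Local File Headers with ZipArchiveInputStream - no seeking involved; "test.zip" is just a placeholder path:
{code:java}
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveInputStream;

public class ListEntriesStreaming {
    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get("test.zip"));
             ZipArchiveInputStream zin = new ZipArchiveInputStream(in)) {
            ZipArchiveEntry entry;
            // each call reads the next Local File Header in stream order
            while ((entry = zin.getNextZipEntry()) != null) {
                System.out.println(entry.getName() + " " + entry.getSize());
            }
        }
    }
}
{code}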
> Why does ZipArchiveInputStream succeed getting name, but not size?
This will *ONLY* happen if the entry uses a data descriptor, and that is
actually a rare case. Most zip files do not use a data descriptor, so you can
get the name and size before you extract the entry. (The size is unavailable up
front because the data descriptor lies after the compressed file data.)
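To make that concrete, a small hedged sketch (the helper name is mine, not part of the API): when a data descriptor is used, the streamed entry reports ArchiveEntry.SIZE_UNKNOWN (-1) until its data has been read.
{code:java}
import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;

public class SizeCheck {
    // hypothetical helper: true if the size was present in the Local File Header
    static boolean sizeKnownUpFront(ZipArchiveEntry entry) {
        // with a data descriptor the real size is stored after the compressed
        // data, so the stream can only report SIZE_UNKNOWN (-1) at this point
        return entry.getSize() != ArchiveEntry.SIZE_UNKNOWN;
    }
}
{code}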
> After parsing the entire InputStream metadata, you could return the result of
> the list/tree of entries.
> This can help for the use case of archive apps, which need to show the list
>of entries in some path inside the zip file anyway, so they would need to
>iterate over the whole metadata anyway...
When you talk about the list of entries in some path inside the zip file, you
are talking about the Central Directory Headers. That's what we are doing in
ZipFile.
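A minimal sketch of that approach, assuming the archive is available as a real file:
{code:java}
import java.io.File;
import java.util.Enumeration;

import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipFile;

public class ListEntriesFromCentralDirectory {
    public static void main(String[] args) throws Exception {
        // ZipFile parses the Central Directory Headers at the end of the
        // archive, so name and size are available for every entry up front
        try (ZipFile zipFile = new ZipFile(new File("test.zip"))) {
            Enumeration<ZipArchiveEntry> entries = zipFile.getEntries();
            while (entries.hasMoreElements()) {
                ZipArchiveEntry e = entries.nextElement();
                System.out.println(e.getName() + " " + e.getSize());
            }
        }
    }
}
{code}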
> Use a stack of the cached data if needed.
Of course this is much more efficient than iterating one entry after another.
Once you have cached all the data of the zip file, you can directly create a
SeekableByteChannel from it, and therefore use ZipFile directly instead of
ZipArchiveInputStream.
When we are talking about *seekable*, there are several ways of implementing it:
we could read all the data into a byte array and build a SeekableByteChannel
from it, or we could use a file to create a SeekableByteChannel. In either case
we can reposition within the data - that's what we need to do in ZipFile.
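Here is a sketch of the in-memory variant, assuming the whole archive fits comfortably in memory (the file-backed InputStream is only a stand-in for whatever non-seekable source you have):
{code:java}
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Enumeration;

import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipFile;
import org.apache.commons.compress.utils.IOUtils;
import org.apache.commons.compress.utils.SeekableInMemoryByteChannel;

public class ZipFileFromStream {
    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get("test.zip"))) {
            // cache the whole stream in a byte array ...
            byte[] cached = IOUtils.toByteArray(in);
            // ... and wrap it in a seekable channel so ZipFile can reposition
            try (ZipFile zipFile = new ZipFile(new SeekableInMemoryByteChannel(cached))) {
                Enumeration<ZipArchiveEntry> entries = zipFile.getEntries();
                while (entries.hasMoreElements()) {
                    ZipArchiveEntry e = entries.nextElement();
                    System.out.println(e.getName() + " " + e.getSize());
                }
            }
        }
    }
}
{code}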
> Bug: cannot get file size of ArchiveEntry using ZipArchiveInputStream
> ---------------------------------------------------------------------
>
> Key: COMPRESS-508
> URL: https://issues.apache.org/jira/browse/COMPRESS-508
> Project: Commons Compress
> Issue Type: Bug
> Components: Build
> Affects Versions: 1.20
> Environment: Android 9 and Android 10, on both emulator and real
> device.
> Reporter: AD_LB
> Priority: Major
> Attachments: 2020-03-31_20-53-36.png, 2020-04-01_18-28-19.mp4,
> ZipTest.zip, ZipTest2.zip, test.zip
>
>
> I'm trying to use ZipArchiveInputStream to iterate over the items of a zip
> file (which may or may not be a real file on the file-system, which is why I
> use a stream), optionally creating a stream from specific entries.
> One of the operations I need is to get the size of the files within.
> For some reason, it fails to do so. Not only that, but it throws an exception
> when I'm done with it:
> {code:java}
> Error:org.apache.commons.compress.archivers.zip.UnsupportedZipFeatureException:
> Unsupported feature data descriptor used in entry ...
> {code}
> I've attached 3 files here: a sample project, the problematic zip file (remember
> that you need to put it in the correct path and grant storage permission),
> and a screenshot of the issue.
> Note that if I open the file using a third party PC app (such as
> [7-zip|https://www.7-zip.org/] ), it works fine, including showing the file
> size inside.
> Files:
> !2020-03-31_20-53-36.png![^test.zip]
> [^ZipTest.zip]
> Here's the relevant code (Kotlin):
>
> {code:java}
> thread {
>     try {
>         val file = File("/storage/emulated/0/test.zip")
>         ZipArchiveInputStream(FileInputStream(file)).use {
>             while (true) {
>                 val entry = it.nextEntry ?: break
>                 Log.d("AppLog", "entry:${entry.name} ${entry.size} ")
>             }
>         }
>         Log.d("AppLog", "got archive ")
>     } catch (e: Exception) {
>         Log.d("AppLog", "Error:$e")
>         e.printStackTrace()
>     }
> }
> {code}