[
https://issues.apache.org/jira/browse/COMPRESS-508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073724#comment-17073724
]
AD_LB commented on COMPRESS-508:
--------------------------------
I think this format also makes sense in a way, though, for the case of writing:
if you want to add a file, its entire group of data is appended at the end, with
some modifications that probably don't need to move anything located before the
new data.
I still don't get why you need to go back. It seems you have the header data for
each file. Doesn't it tell you which path the file is in, or which parent it
has? Or are you saying that you first need to find the path by going over the
entire zip file, and only then gather the entries for it?
Suppose you get an iterator over the entries using the ZipFile class: how does
it go over the zip file, in the structure you've shown, to get the first entry's
metadata (name and size)? And why does ZipArchiveInputStream succeed in getting
the name, but not the size?
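To show what I mean about name vs. size, here is a minimal sketch. It uses the
JDK's own java.util.zip classes instead of Commons Compress (an assumption, just
so it runs without extra dependencies), and all function names are made up for
the example. As far as I understand, ZipOutputStream writes deflated entries
with a data descriptor, so the size is stored after the data and not in the
local header:
{code:java}
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.File
import java.util.zip.ZipEntry
import java.util.zip.ZipFile
import java.util.zip.ZipInputStream
import java.util.zip.ZipOutputStream

// Builds a one-entry zip in memory (hypothetical helper, sample data only).
fun buildSampleZip(): ByteArray {
    val bytes = ByteArrayOutputStream()
    ZipOutputStream(bytes).use { zip ->
        zip.putNextEntry(ZipEntry("hello.txt"))
        zip.write("hello world".toByteArray())
        zip.closeEntry()
    }
    return bytes.toByteArray()
}

// Streaming: only the local header has been read when the entry is returned,
// so the name is known but the size is still reported as unknown (-1).
fun sizeSeenWhileStreaming(zipBytes: ByteArray): Long =
    ZipInputStream(ByteArrayInputStream(zipBytes)).use { it.nextEntry!!.size }

// Random access: ZipFile reads the central directory at the end of the file,
// which does contain the size of every entry.
fun sizeSeenViaCentralDirectory(zipBytes: ByteArray): Long {
    val tmp = File.createTempFile("sample", ".zip")
    try {
        tmp.writeBytes(zipBytes)
        return ZipFile(tmp).use { it.getEntry("hello.txt").size }
    } finally {
        tmp.delete()
    }
}
{code}
Whether ZipArchiveInputStream hits the same situation with my test file is what
I'm trying to understand.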
There is another possible solution:
Use the InputStream to go over each of those groups, ignoring the actual data of
the entries (meaning, checking only the metadata).
Use a stack of cached data if needed. You say you need to go back, so this
would save you from actually going back in storage, since you have already
handled that part and stored what you need.
After parsing the metadata of the entire InputStream, you could return the
list/tree of entries as the result.
The only downside of this approach is that it needs memory, both during the
parsing (to hold the metadata to return to) and after it (to hold the metadata
of all entries). This won't be a problem for zip files that contain only a few
files, but it could be an issue for ones with tons of files inside.
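A rough sketch of this idea, again with java.util.zip standing in for Commons
Compress (an assumption), and with EntryInfo, collectEntryMetadata and sampleZip
being hypothetical names. Note that "ignoring" the data here still means reading
through it, since with a data descriptor the stream has no other way to find
where the entry ends:
{code:java}
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.InputStream
import java.util.zip.ZipEntry
import java.util.zip.ZipInputStream
import java.util.zip.ZipOutputStream

// Metadata kept in memory for each entry (hypothetical type).
data class EntryInfo(val name: String, val size: Long, val isDirectory: Boolean)

// Single forward pass over the stream, keeping only the metadata.
fun collectEntryMetadata(input: InputStream): List<EntryInfo> {
    val result = ArrayList<EntryInfo>()
    ZipInputStream(input).use { zip ->
        while (true) {
            val entry = zip.nextEntry ?: break
            // Reads past the entry's data without keeping it; this is the point
            // where the size from the data descriptor gets filled in.
            zip.closeEntry()
            result.add(EntryInfo(entry.name, entry.size, entry.isDirectory))
        }
    }
    return result
}

// Small in-memory zip used only to try the function out.
fun sampleZip(): ByteArray {
    val bytes = ByteArrayOutputStream()
    ZipOutputStream(bytes).use { zip ->
        zip.putNextEntry(ZipEntry("a.txt"))
        zip.write("12345".toByteArray())
        zip.closeEntry()
        zip.putNextEntry(ZipEntry("dir/b.txt"))
        zip.write("xy".toByteArray())
        zip.closeEntry()
    }
    return bytes.toByteArray()
}
{code}
The returned list is the memory cost mentioned above: one small object per
entry, kept after the stream is closed.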
To improve it a bit, you could let the caller choose where to scan: the root, or
some entry within it. This can help for the use case of archive apps, which need
to show the list of entries under some path inside the zip file, and so have to
iterate over the whole metadata anyway...
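For the "scan only some path" part, a small sketch of what an archive app might
do with the collected names. The function name childrenOf and the sample names
are hypothetical, and it assumes entry names use '/' as the separator, with
directory names ending in '/':
{code:java}
// Returns the direct children of `parent` ("" means the root of the zip).
// Sub-directories are returned once, with a trailing '/'.
fun childrenOf(entryNames: List<String>, parent: String): Set<String> {
    val prefix = if (parent.isEmpty() || parent.endsWith("/")) parent else "$parent/"
    val children = LinkedHashSet<String>()
    for (name in entryNames) {
        // Skip entries outside the requested path, and the directory itself.
        if (!name.startsWith(prefix) || name.length == prefix.length) continue
        val rest = name.substring(prefix.length)
        val slash = rest.indexOf('/')
        // Anything deeper than one level collapses into its top-level folder.
        children.add(if (slash < 0) rest else rest.substring(0, slash + 1))
    }
    return children
}
{code}
For example, with the names "a.txt", "docs/", "docs/x.md" and "docs/img/1.png",
asking for "docs" would give "x.md" and "img/".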
With an iterator, if you say you need to go back, it might cause page misses,
since you depend on storage. It might also affect the processing speed.
Of course, this is a trade-off with memory, and using an iterator might be more
memory-efficient and safer.
> Bug: cannot get file size of ArchiveEntry using ZipArchiveInputStream
> ---------------------------------------------------------------------
>
> Key: COMPRESS-508
> URL: https://issues.apache.org/jira/browse/COMPRESS-508
> Project: Commons Compress
> Issue Type: Bug
> Components: Build
> Affects Versions: 1.20
> Environment: Android 9 and Android 10, on both emulator and real
> device.
> Reporter: AD_LB
> Priority: Major
> Attachments: 2020-03-31_20-53-36.png, 2020-04-01_18-28-19.mp4,
> ZipTest.zip, ZipTest2.zip, test.zip
>
>
> I'm trying to use ZipArchiveInputStream to iterate over the items of a zip
> file (which may or may not be a real file on the file-system, which is why I
> use a stream), optionally creating a stream from specific entries.
> One of the operations I need is to get the size of the files within.
> For some reason, it fails to do so. Not only that, but it throws an exception
> when I'm done with it:
> {code:java}
> Error:org.apache.commons.compress.archivers.zip.UnsupportedZipFeatureException:
> Unsupported feature data descriptor used in entry ...
> {code}
> I've attached here 3 files: a sample project, the problematic zip file
> (remember that you need to put it in the correct path and grant storage
> permission), and a screenshot of the issue.
> Note that if I open the file using a third-party PC app (such as
> [7-zip|https://www.7-zip.org/]), it works fine, including showing the file
> sizes inside.
> Files:
> !2020-03-31_20-53-36.png![^test.zip]
> [^ZipTest.zip]
> Here's the relevant code (Kotlin):
>
> {code:java}
> thread {
>     try {
>         val file = File("/storage/emulated/0/test.zip")
>         ZipArchiveInputStream(FileInputStream(file)).use {
>             while (true) {
>                 val entry = it.nextEntry ?: break
>                 Log.d("AppLog", "entry:${entry.name} ${entry.size} ")
>             }
>         }
>         Log.d("AppLog", "got archive ")
>     } catch (e: Exception) {
>         Log.d("AppLog", "Error:$e")
>         e.printStackTrace()
>     }
> }
> {code}