[
https://issues.apache.org/jira/browse/COMPRESS-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641106#comment-16641106
]
Stefan Bodewig commented on COMPRESS-466:
-----------------------------------------
I'm afraid there is more to it. With your patch {{ZipFile.getEntry}} will
always return {{null}}, because {{nameMap}} is never populated, so we need a
few more adjustments. I'm using your patch as a starting point for adding the
ability to skip parsing of the local file headers when you know you don't
need the extra field data stored in them.
We do know the total size of the central directory, as it is stored inside the
"End of central directory record" or the "Zip64 end of central directory
record". Reading the whole central directory in one go might be faster, but
we'd want to measure whether the difference is actually significant (and that
would be a different issue :) )
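
To illustrate the point about the central directory size, here is a minimal sketch using only the JDK. It is not Commons Compress code: the class {{EocdSketch}} and its method names are made up for this example. It assumes a plain (non-Zip64) archive on a single disk, so a 35 GB file like the one in this issue would additionally need the Zip64 record, which the sketch does not handle.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of Commons Compress.
final class EocdSketch {
    private static final int EOCD_SIGNATURE = 0x06054b50;
    private static final int EOCD_MIN_SIZE = 22;    // fixed part of the record
    private static final int MAX_COMMENT = 0xFFFF;  // longest possible archive comment

    /** Returns {offset, size, entryCount} of the central directory. */
    static long[] locateCentralDirectory(RandomAccessFile file) throws IOException {
        long fileLength = file.length();
        if (fileLength < EOCD_MIN_SIZE) {
            throw new IOException("File is too short to be a zip archive");
        }
        // The record sits at the end of the file, followed only by the comment.
        int tailLength = (int) Math.min(fileLength, EOCD_MIN_SIZE + MAX_COMMENT);
        byte[] tail = new byte[tailLength];
        file.seek(fileLength - tailLength);
        file.readFully(tail);
        ByteBuffer buf = ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN);
        // Scan backwards for the signature.
        for (int pos = tailLength - EOCD_MIN_SIZE; pos >= 0; pos--) {
            if (buf.getInt(pos) == EOCD_SIGNATURE) {
                long entries = buf.getShort(pos + 10) & 0xFFFFL;
                long size = buf.getInt(pos + 12) & 0xFFFFFFFFL;
                long offset = buf.getInt(pos + 16) & 0xFFFFFFFFL;
                return new long[] {offset, size, entries};
            }
        }
        throw new IOException("End of central directory record not found");
    }

    /** Reads the whole central directory with a single seek and a single read. */
    static byte[] readCentralDirectory(RandomAccessFile file) throws IOException {
        long[] location = locateCentralDirectory(file);
        byte[] directory = new byte[Math.toIntExact(location[1])];
        file.seek(location[0]);
        file.readFully(directory);
        return directory;
    }
}
```

Whether one bulk read like this actually beats entry-by-entry reads would still need to be measured, as noted above.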
> Opening of a very large zip file is extremely slow compared to
> java.util.zip.ZipFile
> ------------------------------------------------------------------------------------
>
> Key: COMPRESS-466
> URL: https://issues.apache.org/jira/browse/COMPRESS-466
> Project: Commons Compress
> Issue Type: Bug
> Components: Compressors
> Affects Versions: 1.18
> Environment: Tested both on Linux and OSX 10.13.6.
> Reporter: Jakob Sultan Ericsson
> Priority: Major
>
> We have a quite large zip file (35 GB) and try to open it with ZipFile.
> {code:java}
> long start = System.currentTimeMillis(); // start of the timed section
> try (ZipFile zf = new ZipFile(new File("35gb.zip"))) {
>     System.out.println("File opened..." + (System.currentTimeMillis() - start));
> }
> {code}
> This code takes about 300 000 - 400 000 ms (roughly 5-7 minutes).
> If I run this with the JDK's built-in java.util.zip.ZipFile, the same code
> takes 300 ms (less than a second).
> I'm not totally sure what the problem is, but I did some debugging and
> basically all the time is spent in
> {code:java}
> private void resolveLocalFileHeaderData(final Map<ZipArchiveEntry,
> NameAndComment> entriesWithoutUTF8Flag)
> {code}
> Anything that can be done to improve this?