[ 
https://issues.apache.org/jira/browse/COMPRESS-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487906#comment-17487906
 ] 

Peter Lee commented on COMPRESS-607:
------------------------------------

For archives stored on disk, you can choose either ZipFile or 
ZipArchiveInputStream, as needed. ZipArchiveInputStream, however, is designed 
to work in memory. IMO a limit of roughly 2 GB is reasonable - it's the maximum 
size of a byte array in Java (Integer.MAX_VALUE elements).

It seems we need to document this limitation.
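
For the on-disk case, here is a minimal sketch of the ZipFile route. It uses 
the JDK's java.util.zip.ZipFile as a stand-in so that it runs without extra 
dependencies; the names StreamingUnzip, copy and extractAll are made up for 
illustration. Commons Compress's own ZipFile offers getEntries() and 
getInputStream(ZipArchiveEntry) for the same pattern. The point is that each 
entry is copied through a small fixed buffer, so memory use should not grow 
with the entry size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, not part of Commons Compress.
final class StreamingUnzip {

    /** Copies in to out through a fixed 8 KiB buffer and returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0; // long, so entries beyond Integer.MAX_VALUE bytes are counted correctly
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** Extracts every entry of archive into targetDir and returns the bytes written. */
    static long extractAll(Path archive, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        long total = 0;
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path target = root.resolve(entry.getName()).normalize();
                // Reject entry names such as "../x" that would land outside the target directory.
                if (!target.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                    continue;
                }
                Files.createDirectories(target.getParent());
                try (InputStream in = zip.getInputStream(entry);
                     OutputStream out = Files.newOutputStream(target)) {
                    total += copy(in, out);
                }
            }
        }
        return total;
    }
}
```

Because ZipFile reads entry sizes from the central directory, it does not need 
to buffer a STORED entry to find where it ends.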

> ZipArchiveInputStream: Large STORED entry leads to OutOfMemory
> --------------------------------------------------------------
>
>                 Key: COMPRESS-607
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-607
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.21
>            Reporter: Robin Schimpf
>            Priority: Major
>
> While extracting a large Zip with only STORED entries, a file larger than 4GB 
> triggered an OutOfMemoryError without the JVM having exhausted the available 
> memory.
> {code:java}
> Caused by: java.lang.OutOfMemoryError
>         at java.base/java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:125)
>         at java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:119)
>         at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
>         at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)
>         at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.cacheBytesRead(ZipArchiveInputStream.java:1086)
>         at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.readStoredEntry(ZipArchiveInputStream.java:1015)
>         at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.readStored(ZipArchiveInputStream.java:588)
>         at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:526){code}
> The stream seems to buffer the whole entry in memory until it finds the next 
> entry. Since the buffer is stored in a ByteArrayOutputStream, at most 
> Integer.MAX_VALUE bytes can be buffered.
> Is this limitation intended? I have read 
> [https://commons.apache.org/proper/commons-compress/zip.html#ZipArchiveInputStream_vs_ZipFile]
> but found nothing about a file size limitation.
> Since the file is stored on disk, I will switch to ZipFile, but for other 
> cases it would be preferable to be able to extract such files.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
