[ https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682209#comment-16682209 ]
Stefan Bodewig commented on COMPRESS-446:
-----------------------------------------

The one you are currently using implicitly is
https://github.com/apache/commons-compress/blob/master/src/main/java/org/apache/commons/compress/parallel/FileBasedScatterGatherBackingStore.java

By using the two-arg constructor of {{ParallelScatterZipCreator}} you can plug
in your own {{ScatterGatherBackingStoreSupplier}}, which has a single method
that is responsible for creating a new {{ScatterGatherBackingStore}}.

This assumes you are creating {{ParallelScatterZipCreator}} yourself rather
than using a third-party library that abstracts it away from you.

> Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)
> --------------------------------------------------------------------------
>
>                 Key: COMPRESS-446
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-446
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Archivers
>    Affects Versions: 1.16.1
>         Environment: The application was running inside a Docker container,
> the JVM had about 1.7 GByte heap space.
>            Reporter: Christoph Ludwig
>            Priority: Major
>              Labels: zip
>             Fix For: 1.17
>
>
> Before it does anything else,
> {{ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)}} loops over all
> futures returned by the creator's executor service and calls
> {{Future#get()}}. Each call blocks until the corresponding future's
> computation has completed, i.e., until all entries have been written to the
> thread-local scatter streams.
> However, if the computation of a future fails, then {{Future#get()}} can also
> throw an exception. This exception escapes
> {{ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)}} before the
> executor service is shut down. As a result, the thread-local variables in
> the executor service's threads, and all objects referenced by them, continue
> to exist and cannot be reclaimed by the GC.
> I encountered this situation when the JVM threw an {{OutOfMemoryError}}
> while processing an archive with 130,000 documents. The application was not
> able to recover from this OOM error because most of the heap was occupied by
> objects reachable from the executor service's threads.
> Of course, the OOM is mostly the fault of my own code; I will be able to work
> around the "leaked" executor service because I supply it in the first place
> and can therefore shut it down if I detect an error situation.
> The effect would be the same, though, if, say, {{Future#get()}} throws an
> {{InterruptedException}}. Therefore,
> {{ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)}} should either
> shut down and release all resources if it cannot complete its task due to an
> exception thrown by a future, or it should offer a reasonable recovery
> strategy.
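
For illustration only, the workaround mentioned in the description (shutting down a caller-supplied executor service when {{Future#get()}} fails) could look roughly like the sketch below. It uses only {{java.util.concurrent}} and deliberately does not touch Commons Compress. The class name {{ShutdownOnFailure}} and the method {{awaitAll}} are hypothetical names made up for this example; this is not the change that went into 1.17.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Commons Compress.
public class ShutdownOnFailure {

    /**
     * Waits for every future to complete. The executor is shut down in the
     * finally block, so its worker threads (and anything reachable from
     * their thread-locals) are released even if Future#get() throws.
     */
    public static void awaitAll(ExecutorService executor, List<Future<?>> futures)
            throws InterruptedException, ExecutionException {
        try {
            for (Future<?> future : futures) {
                future.get(); // may throw ExecutionException or InterruptedException
            }
        } finally {
            // Assumption: on failure the remaining tasks are not worth finishing,
            // so shutdownNow() is used rather than shutdown().
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        List<Future<?>> futures = new ArrayList<>();

        Runnable ok = () -> { };
        Callable<Void> failing = () -> {
            throw new IllegalStateException("simulated failure");
        };
        futures.add(executor.submit(ok));
        futures.add(executor.submit(failing));

        try {
            awaitAll(executor, futures);
        } catch (ExecutionException e) {
            System.out.println("task failed: " + e.getCause().getMessage());
        }
        System.out.println("executor shut down: " + executor.isShutdown());
    }
}
```

Whether {{shutdownNow()}} or a plain {{shutdown()}} is appropriate depends on whether the still-running tasks should be interrupted; the sketch picks the former only to keep the failure path short.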