[
https://issues.apache.org/jira/browse/COMPRESS-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844036#comment-13844036
]
Julien Aymé commented on COMPRESS-248:
--------------------------------------
Hello Jing, thanks for the report.
Could you tell us what is thrown when you try to extract the third file using
java.util.zip.GZIPInputStream directly?
{code}
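File corruptFile = ...; // the corrupt .gz file
long total = 0; // decompressed bytes read before any failure
// try-with-resources (Java 7+) closes the stream even if read() throws
try (InputStream in = new GZIPInputStream(new FileInputStream(corruptFile))) {
    byte[] buf = new byte[4096];
    int read;
    while ((read = in.read(buf)) != -1) {
        total += read;
    }
}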
{code}
If the same OutOfMemoryError is thrown, the problem is most likely in the JDK
rather than in Commons Compress, so I suggest opening a bug at Oracle
(http://bugs.sun.com/bugdatabase/).
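For reference, a self-contained version of that check could look like the sketch below. The class name GzipProbe and its methods are made up for illustration, and the truncated in-memory stream only stands in for a corrupt file; it is not expected to reproduce the OutOfMemoryError from the report. Passing the path of the real file as the first argument probes that file instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Commons Compress or the JDK.
public class GzipProbe {

    /** Decompresses the stream with the JDK's GZIPInputStream and reports the outcome. */
    public static String probe(InputStream raw) {
        long total = 0;
        try (InputStream in = new GZIPInputStream(raw)) {
            byte[] buf = new byte[4096];
            int read;
            while ((read = in.read(buf)) != -1) {
                total += read;
            }
            return "OK: " + total + " bytes";
        } catch (IOException | OutOfMemoryError e) {
            // Report the exact type so it can be compared with the Commons Compress trace.
            return "FAILED after " + total + " bytes: " + e.getClass().getName();
        }
    }

    /** Compresses data in memory; used only to build a stand-in test input. */
    public static byte[] gzip(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        } catch (IOException e) {
            throw new IllegalStateException("in-memory compression failed", e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Probe the file given on the command line, e.g. the corrupt .gz file.
            System.out.println(probe(new FileInputStream(args[0])));
            return;
        }
        // No argument: probe a valid stream and a truncated copy of it.
        byte[] good = gzip(new byte[10000]);
        byte[] truncated = Arrays.copyOf(good, good.length / 2);
        System.out.println(probe(new ByteArrayInputStream(good)));
        System.out.println(probe(new ByteArrayInputStream(truncated)));
    }
}
```

If the reported type for the real file is java.lang.OutOfMemoryError, the failure is reproducible without Commons Compress on the classpath.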
If the corrupt file is not sensitive and not too big, could you attach it to
the issue?
Thanks in advance,
regards,
Julien
> Naive OOM when deal with a corrupt .gz file
> -------------------------------------------
>
> Key: COMPRESS-248
> URL: https://issues.apache.org/jira/browse/COMPRESS-248
> Project: Commons Compress
> Issue Type: Bug
> Components: Compressors
> Affects Versions: 1.6
> Environment: Fedora 19 x86_64, 8G RAM, Java version "1.7.0_45"
> OpenJDK Runtime Environment (fedora-2.4.3.0.fc19-x86_64 u45-b15)
> OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
> Reporter: Jing Li
>
> I tried to extract three gz files, all of which are corrupt. The first two
> throw IOExceptions:
> Caused by: java.io.IOException: Gzip-compressed data is corrupt
> at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.read(GzipCompressorInputStream.java:253)
> at java.io.InputStream.read(InputStream.java:82)
> ...
> But the third one throws an OutOfMemoryError, as shown below:
> java.lang.OutOfMemoryError
> at java.util.zip.Inflater.inflateBytes(Native Method)
> at java.util.zip.Inflater.inflate(Inflater.java:238)
> at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.read(GzipCompressorInputStream.java:251)
> at java.io.InputStream.read(InputStream.java:82)
> The third file is corrupt, but Linux recognizes it as a gzip-compressed file.
> More info:
> [jing@localhost logs]$ file stdout.log.txt.gz
> stdout.log.txt.gz: gzip compressed data, was "stdout.log_backup", from Unix, last modified: Tue Nov 19 22:53:19 2013
> [jing@localhost logs]$ tar -xvzf stdout.log.txt.gz
> gzip: stdin: invalid compressed data--format violated
> tar: Child returned status 1
> tar: Error is not recoverable: exiting now