BZip2CompressorInputStream truncates files compressed with pbzip2
-----------------------------------------------------------------
Key: COMPRESS-185
URL: https://issues.apache.org/jira/browse/COMPRESS-185
Project: Commons Compress
Issue Type: Bug
Components: Compressors
Affects Versions: 1.3
Reporter: Karsten Loesing
Fix For: 1.4
I'm using BZip2CompressorInputStream in Compress 1.3 to decompress a file that
was created with pbzip2 1.1.6 (http://compression.ca/pbzip2/). The stream ends
early after 900000 bytes, truncating the rest of the pbzip2-compressed file.
Decompressing the file with bunzip2 or compressing the original file with bzip2
both fix the issue. I think both pbzip2 and Compress are to blame here: pbzip2
apparently does something non-standard when compressing files, and Compress
should handle the non-standard format rather than pretending to be done
decompressing. Another option is that I'm doing something wrong; in that case
please let me know! :)
Here's how the problem can be reproduced:
1. Generate a file that's 900000+ bytes large: dd if=/dev/zero of=1mbfile
count=1 bs=1M
2. Compress with pbzip2: pbzip2 1mbfile
3. Decompress with Bunzip2 class below
4. Notice how the resulting 1mbfile is 900000 bytes large, not 1M.
Now compare to using bunzip2/bzip2:
- Do the steps above, but instead of 2, compress with bzip2: bzip2 1mbfile
- Do the steps above, but instead of 3, decompress with bunzip2: bunzip2
1mbfile.bz2
import java.io.*;
import org.apache.commons.compress.compressors.bzip2.*;
public class Bunzip2 {
public static void main(String[] args) throws Exception {
File inFile = new File(args[0]);
File outFile = new File(args[0].substring(0, args[0].length() - 4));
FileInputStream fis = new FileInputStream(inFile);
BZip2CompressorInputStream bz2cis =
new BZip2CompressorInputStream(fis);
BufferedInputStream bis = new BufferedInputStream(bz2cis);
BufferedOutputStream bos = new BufferedOutputStream(
new FileOutputStream(outFile));
int len;
byte[] data = new byte[1024];
while ((len = bis.read(data, 0, 1024)) >= 0) {
bos.write(data, 0, len);
}
bos.close();
bis.close();
}
}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira