[ 
https://issues.apache.org/jira/browse/COMPRESS-654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813198#comment-17813198
 ] 

Gary D. Gregory commented on COMPRESS-654:
------------------------------------------

[~wyverald]
From the dev mailing list at
https://lists.apache.org/thread/c8k9cgoko3d1xntjnscq9s7fhz85x276 :

{noformat}
---------- Forwarded message ---------
From: Peter Hull <[email protected]>
Date: Wed, Jan 31, 2024 at 8:34 AM
Subject: Re: [COMPRESS] Decompress BZIP2 File Max Output is 900000 chars
To: Commons Developers List <[email protected]>


I can't add to the JIRA bug but I had a quick play on WSL (debian),
Java 21, compress 1.25.0 and found:
Using dd if=/dev/random I could create a big file, compress it with
bzip2 and then decompress it with BZip2CompressorInputStream , no
problems
Same file compressed with pbzip2 was truncated at 900000 as described.
Those 900000 bytes were just the first 900000 bytes of the correct output
So it is pbzip2 vs bzip2, nothing to do with tar files.

Description for BZip2CompressorInputStream
(https://commons.apache.org/proper/commons-compress/apidocs/org/apache/commons/compress/compressors/bzip2/BZip2CompressorInputStream.html)
says there is another constructor with a boolean flag for
decompressing concatenated files.

Using this constructor appears to work OK.

Therefore I assume that pbzip2 creates concatenated bzip files?

Hope that helps
Peter
{noformat}
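
The behaviour Peter describes can be reproduced without pbzip2 or Commons Compress. The sketch below is only an illustration in Python's standard `bz2` module, not the library's implementation. It assumes, as Peter suspects, that the file is a series of independent bzip2 streams written back to back, and it picks a 900000-byte chunk size to match the truncation point in the report. The helper names (`compress_in_chunks`, `decompress_first_stream`, `decompress_all_streams`) are hypothetical. A decoder that stops after the first stream returns exactly the first 900000 bytes of the correct output, while one that keeps reading returns everything.

```python
import bz2

# Chunk size chosen for illustration; it matches the truncation point in the report.
CHUNK_SIZE = 900_000


def compress_in_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Compress each chunk as an independent bzip2 stream and concatenate them."""
    return b"".join(
        bz2.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    )


def decompress_first_stream(blob: bytes) -> bytes:
    """Decode only the first bzip2 stream and ignore whatever follows it."""
    decoder = bz2.BZ2Decompressor()
    return decoder.decompress(blob)


def decompress_all_streams(blob: bytes) -> bytes:
    """Decode every concatenated stream, one decoder per stream."""
    parts = []
    while blob:
        decoder = bz2.BZ2Decompressor()
        parts.append(decoder.decompress(blob))
        # Bytes left over after the end of this stream belong to the next one.
        blob = decoder.unused_data
    return b"".join(parts)


if __name__ == "__main__":
    original = bytes(range(256)) * 8000  # 2,048,000 bytes of deterministic data
    blob = compress_in_chunks(original)
    print("first stream only:", len(decompress_first_stream(blob)))
    print("all streams:      ", len(decompress_all_streams(blob)))
```

In Commons Compress the corresponding switch is the boolean flag on the second BZip2CompressorInputStream constructor mentioned in the forwarded message.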


> Issue extracting certain sparse tarballs
> ----------------------------------------
>
>                 Key: COMPRESS-654
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-654
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Archivers
>    Affects Versions: 1.25.0
>            Reporter: Xudong Yang
>            Priority: Major
>         Attachments: ruff-aarch64-apple-darwin.tar.gz
>
>
> We maintain the Bazel ([https://bazel.build/]) build system, which uses 
> Apache Commons Compress to handle archive extraction. A user reported that a 
> certain sparse tarball always triggers an error 
> ([https://github.com/bazelbuild/bazel/issues/20269#issuecomment-1821250607]), 
> and the steps to reproduce the error are very simple:
>  
> {{#!/usr/bin/env bash}}
> {{set -o errexit -o nounset}}
> {{echo "Downloading commons-compress"}}
> {{wget 
> [https://repo1.maven.org/maven2/org/apache/commons/commons-compress/1.25.0/commons-compress-1.25.0.jar]}}
> {{echo "Downloading sample sparse archive"}}
> {{wget 
> [https://github.com/astral-sh/ruff/releases/download/v0.1.6/ruff-aarch64-apple-darwin.tar.gz]}}
> {{gunzip ruff-aarch64-apple-darwin.tar.gz}}
> {{echo "Testing with system tar"}}
> {{tar -tf ruff-aarch64-apple-darwin.tar}}
> {{echo "Testing with commons-compress"}}
> {{java -jar commons-compress-1.25.0.jar ruff-aarch64-apple-darwin.tar}}
>  
> Output:
>  
> {{Testing with system tar}}
> {{ruff}}
> {{Testing with commons-compress}}
> {{Analysing ruff-aarch64-apple-darwin.tar}}
> {{Created 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream@17f052a3}}
> {{ruff}}
> {{Exception in thread "main" java.io.IOException: Truncated TAR archive}}
> {{at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.read(TarArchiveInputStream.java:694)}}
> {{at org.apache.commons.compress.utils.IOUtils.readFully(IOUtils.java:244)}}
> {{at org.apache.commons.compress.utils.IOUtils.skip(IOUtils.java:355)}}
> {{at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:451)}}
> {{at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextEntry(TarArchiveInputStream.java:426)}}
> {{at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextEntry(TarArchiveInputStream.java:50)}}
> {{at org.apache.commons.compress.archivers.Lister.listStream(Lister.java:79)}}
> {{at org.apache.commons.compress.archivers.Lister.main(Lister.java:133)}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
