[
https://issues.apache.org/jira/browse/COMPRESS-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820426#comment-17820426
]
Cosmin Carabet edited comment on COMPRESS-666 at 2/24/24 9:33 PM:
------------------------------------------------------------------
The following compiles for me with Java 8:
[https://github.com/apache/commons-compress/pull/488]
It looks like CI doesn't run for forks, but hopefully you can use this as a
starting point.
was (Author: JIRAUSER304399):
The following compiles for me with Java8
[https://github.com/apache/commons-compress/pull/488]
> Commons compress 1.26.0 gives unexpected Corrupted TAR archive
> --------------------------------------------------------------
>
> Key: COMPRESS-666
> URL: https://issues.apache.org/jira/browse/COMPRESS-666
> Project: Commons Compress
> Issue Type: Bug
> Environment: Commons compress 1.26.0 to get a failure. Any tar tgz.
> Reporter: Cosmin Carabet
> Priority: Major
>
> Something in
> [https://github.com/apache/commons-compress/compare/rel/commons-compress-1.25.0...master]
> seems to make iterating through the tar entries of multiple
> TarArchiveInputStreams throw Corrupted TAR archive:
>
> {code:java}
> @Test
> void bla() {
>     ExecutorService executorService = Executors.newFixedThreadPool(10);
>     List<CompletableFuture<Void>> tasks = IntStream.range(0, 200)
>             .mapToObj(_idx -> CompletableFuture.runAsync(
>                     () -> {
>                         // Each task opens its own streams; nothing is shared between threads.
>                         try (InputStream inputStream = this.getClass()
>                                         .getResourceAsStream("/<your favourite tar tgz>");
>                                 TarArchiveInputStream tarInputStream =
>                                         new TarArchiveInputStream(new GZIPInputStream(inputStream))) {
>                             TarArchiveEntry tarEntry;
>                             while ((tarEntry = tarInputStream.getNextTarEntry()) != null) {
>                                 System.out.println("Reading entry %s with size %d"
>                                         .formatted(tarEntry.getName(), tarEntry.getSize()));
>                             }
>                         } catch (Exception ex) {
>                             throw new SafeRuntimeException(ex);
>                         }
>                     },
>                     executorService))
>             .toList();
>
>     // Block until every task has finished.
>     Futures.getUnchecked(CompletableFuture.allOf(tasks.toArray(new CompletableFuture<?>[0])));
> } {code}
> Although TarArchiveInputStream is marked as not thread safe, I am not reusing
> objects here. Those are in fact separate objects, presumably all with their
> own position tracking info.
>
> The stacktrace here looks like:
> {code:java}
> Caused by: java.io.IOException: Corrupted TAR archive.
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1480)
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.<init>(TarArchiveEntry.java:534)
>     at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:431)
>     at
> Caused by: java.lang.IllegalArgumentException: Invalid byte 100 at offset 0 in 'dddddddddddd' len=12
>     at org.apache.commons.compress.archivers.tar.TarUtils.parseOctal(TarUtils.java:516)
>     at org.apache.commons.compress.archivers.tar.TarUtils.parseOctalOrBinary(TarUtils.java:540)
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeaderUnwrapped(TarArchiveEntry.java:1496)
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1478)
>     ... 7 more
> {code}
> That code shows that the header is occasionally wrong (the tar entry name
> contains gibberish), which makes me think that `getNextTarEntry()` can be
> faulty.
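
For readers wondering where "Invalid byte 100 at offset 0" comes from: numeric
tar header fields are stored as ASCII octal digits, and byte 100 is the ASCII
letter 'd', which is not an octal digit. The sketch below is a simplified
stand-in written for illustration only. It is not the Commons Compress
implementation, and the class and method names (OctalFieldSketch,
parseOctalField) are made up for this example.

```java
import java.nio.charset.StandardCharsets;

public class OctalFieldSketch {

    /**
     * Parses an ASCII octal header field. Simplified illustration only,
     * NOT the Commons Compress code (hypothetical helper).
     */
    public static long parseOctalField(byte[] buffer, int offset, int length) {
        int start = offset;
        int end = offset + length;
        // Skip leading spaces.
        while (start < end && buffer[start] == ' ') {
            start++;
        }
        // Trim trailing NUL and space terminators.
        while (end > start && (buffer[end - 1] == 0 || buffer[end - 1] == ' ')) {
            end--;
        }
        long result = 0;
        for (int i = start; i < end; i++) {
            byte b = buffer[i];
            if (b < '0' || b > '7') {
                // Anything outside '0'..'7' means the field is not valid octal.
                throw new IllegalArgumentException(
                        "Invalid byte " + b + " at offset " + (i - offset));
            }
            result = (result << 3) + (b - '0');
        }
        return result;
    }

    public static void main(String[] args) {
        // A well-formed 12-byte field: 11 octal digits plus a NUL terminator.
        byte[] valid = "00000001750\0".getBytes(StandardCharsets.US_ASCII);
        System.out.println(parseOctalField(valid, 0, valid.length));

        // The field contents seen in the stack trace.
        byte[] garbage = "dddddddddddd".getBytes(StandardCharsets.US_ASCII);
        try {
            parseOctalField(garbage, 0, garbage.length);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the numeric field really contains 'dddddddddddd', the 512-byte header block
handed to the parser did not hold header data at all, which would fit the
entry name also looking like gibberish.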
>
> Running that code with commons compress 1.25.0 works as expected, so it's
> probably something added since November. Note that this is related to
> parallelism: using an executor service with a single thread doesn't suffer
> from the same error. The tgz to decompress doesn't really matter; you can use
> a manually created one of a few KB.
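
One way to check the single-thread versus multi-thread observation is a small
harness that runs the same task repeatedly on pools of different sizes and
counts the failures. This is only a sketch under assumptions: the class and
method names (ParallelReproSketch, countFailures) are hypothetical, and the
task in main is an empty placeholder that would need to be replaced with the
tar-reading loop from the report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelReproSketch {

    /**
     * Runs the task `runs` times on a fixed pool of `threads` threads and
     * returns how many runs ended with an exception (hypothetical helper).
     */
    public static int countFailures(int threads, int runs, Callable<Void> task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> futures = new ArrayList<>();
            for (int i = 0; i < runs; i++) {
                futures.add(pool.submit(task));
            }
            int failures = 0;
            for (Future<Void> future : futures) {
                try {
                    future.get();
                } catch (ExecutionException e) {
                    // The task threw; count it and keep going.
                    failures++;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting", e);
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholder: replace the body with code that opens its own
        // TarArchiveInputStream and iterates over the entries.
        Callable<Void> task = () -> {
            return null;
        };
        System.out.println("1 thread:   " + countFailures(1, 200, task) + " failures");
        System.out.println("10 threads: " + countFailures(10, 200, task) + " failures");
    }
}
```

With the real task plugged in, the report suggests the single-thread run
should stay at zero failures on 1.26.0 while the ten-thread run occasionally
does not.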
--
This message was sent by Atlassian Jira
(v8.20.10#820010)