[ 
https://issues.apache.org/jira/browse/COMPRESS-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821707#comment-17821707
 ] 

Gary D. Gregory commented on COMPRESS-666:
------------------------------------------

Hello [~cosmin79],
Yes, this is what I get locally:
{noformat}
org.opentest4j.AssertionFailedError
        at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:46)
        at org.junit.jupiter.api.Assertions.fail(Assertions.java:161)
        at org.apache.commons.compress.compressors.GZipTest.testCompress666(GZipTest.java:88)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.util.ArrayList.forEach(ArrayList.java:1259)
        at java.util.ArrayList.forEach(ArrayList.java:1259)
Caused by: java.util.concurrent.ExecutionException: org.opentest4j.AssertionFailedError: org.apache.commons.compress.archivers.tar.TarArchiveEntry@101289c2
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.commons.compress.compressors.GZipTest.testCompress666(GZipTest.java:81)
        ... 3 more
Caused by: org.opentest4j.AssertionFailedError: org.apache.commons.compress.archivers.tar.TarArchiveEntry@101289c2
        at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:42)
        at org.junit.jupiter.api.Assertions.fail(Assertions.java:150)
        at org.apache.commons.compress.compressors.GZipTest.lambda$1(GZipTest.java:75)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.zip.ZipException: Corrupt GZIP trailer
        at java.util.zip.GZIPInputStream.readTrailer(GZIPInputStream.java:225)
        at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:119)
        at org.apache.commons.io.IOUtils.skip(IOUtils.java:2422)
        at org.apache.commons.io.IOUtils.skip(IOUtils.java:2380)
        at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.consumeRemainderOfLastBlock(TarArchiveInputStream.java:320)
        at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getRecord(TarArchiveInputStream.java:505)
        at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:418)
        at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextEntry(TarArchiveInputStream.java:392)
        at org.apache.commons.compress.compressors.GZipTest.lambda$1(GZipTest.java:71)
        ... 5 more
{noformat}
 

> Multithreaded access to Tar archive throws java.util.zip.ZipException: Corrupt GZIP trailer
> -------------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-666
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-666
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.26.0
>         Environment: Commons compress 1.26.0 to get a failure. Any tar tgz.
>            Reporter: Cosmin Carabet
>            Priority: Major
>
> A change in 
> [https://github.com/apache/commons-compress/compare/rel/commons-compress-1.25.0...master]
> seems to make iterating through the tar entries of multiple 
> TarArchiveInputStreams in parallel throw java.io.IOException: Corrupted TAR archive:
>  
> {code:java}
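> // Imports inferred from the identifiers used below; Futures is assumed to be Guava's.
> import java.io.InputStream;
> import java.util.List;
> import java.util.concurrent.CompletableFuture;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.stream.IntStream;
> import java.util.zip.GZIPInputStream;
> import com.google.common.util.concurrent.Futures;
> import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
> import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
> import org.junit.jupiter.api.Test;
>
> @Test
> void bla() {
>     ExecutorService executorService = Executors.newFixedThreadPool(10);
>     // 200 tasks, each opening its own independent stream over the same resource.
>     List<CompletableFuture<Void>> tasks = IntStream.range(0, 200)
>             .mapToObj(_idx -> CompletableFuture.runAsync(
>                     () -> {
>                         try (InputStream inputStream = this.getClass()
>                                         .getResourceAsStream(
>                                                 "/<your favourite tar tgz>");
>                                 TarArchiveInputStream tarInputStream =
>                                         new TarArchiveInputStream(new GZIPInputStream(inputStream))) {
>                             TarArchiveEntry tarEntry;
>                             while ((tarEntry = tarInputStream.getNextTarEntry()) != null) {
>                                 System.out.println("Reading entry %s with size %d"
>                                         .formatted(tarEntry.getName(), tarEntry.getSize()));
>                             }
>                         } catch (Exception ex) {
>                             throw new RuntimeException(ex);
>                         }
>                     },
>                     executorService))
>             .toList();
>     // Wait for every task; the first failure surfaces here.
>     Futures.getUnchecked(CompletableFuture.allOf(tasks.toArray(new CompletableFuture<?>[0])));
> } {code}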
> Although TarArchiveInputStream is marked as not thread safe, I am not reusing 
> objects here. Those are in fact separate objects, presumably all with their 
> own position tracking info.
>  
> The stacktrace here looks like:
> {code:java}
> Caused by: java.io.IOException: Corrupted TAR archive.
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1480)
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.<init>(TarArchiveEntry.java:534)
>     at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:431)
>     at
> Caused by: java.lang.IllegalArgumentException: Invalid byte 100 at offset 0 in 'dddddddddddd' len=12
>     at org.apache.commons.compress.archivers.tar.TarUtils.parseOctal(TarUtils.java:516)
>     at org.apache.commons.compress.archivers.tar.TarUtils.parseOctalOrBinary(TarUtils.java:540)
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeaderUnwrapped(TarArchiveEntry.java:1496)
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1478)
>     ... 7 more
>  {code}
> Running that code shows that the header is occasionally wrong (the tar entry 
> name contains garbage bytes), which makes me think that `getNextTarEntry()` 
> can be faulty.
>  
> Running that code with Commons Compress 1.25.0 works as expected, so the 
> regression was probably introduced since November. Note that it is related 
> to parallelism: using an executor service with a single thread doesn't 
> produce the error. The tgz to decompress doesn't really matter; a manually 
> created one of a few KB is enough.
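
To help separate the gzip layer from the tar layer when investigating the parallelism observation above, here is a minimal JDK-only baseline. It is a sketch under stated assumptions: the class name GzipParallelBaseline and its helper methods are hypothetical, it does not use Commons Compress, and so it does not reproduce COMPRESS-666. It only checks that independent java.util.zip.GZIPInputStream instances decode the same payload correctly when run on a thread pool.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; not part of Commons Compress or its test suite.
public class GzipParallelBaseline {

    /** Gzips the given bytes in memory. */
    public static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(plain);
        }
        return bos.toByteArray();
    }

    /** Decompresses a gzip payload through its own, unshared stream. */
    public static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[512];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    /** Submits the tasks to a fixed pool; returns how many threw or decoded wrong bytes. */
    public static int countFailures(int threads, int tasks) throws Exception {
        byte[] plain = new byte[8 * 1024];
        for (int i = 0; i < plain.length; i++) {
            plain[i] = (byte) (i % 251);
        }
        byte[] compressed = gzip(plain);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                // Each task builds its own stream; only the read-only byte array is shared.
                Callable<Boolean> task = () -> Arrays.equals(plain, gunzip(compressed));
                results.add(pool.submit(task));
            }
            int failures = 0;
            for (Future<Boolean> result : results) {
                try {
                    if (!result.get()) {
                        failures++;
                    }
                } catch (ExecutionException e) {
                    failures++;
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("failures: " + countFailures(10, 200));
    }
}
```

The pool size and task count mirror the 10 threads and 200 tasks of the reproduction in the issue description. If this baseline reports no failures while the tar-based test still fails, that would suggest looking at the tar reading path rather than at the JDK gzip stream itself.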



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
