[ https://issues.apache.org/jira/browse/COMPRESS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roland Kreuzer updated COMPRESS-592:
------------------------------------
    Description: 
I have a use case where I have to decompress 7-Zip archives from an external source that may contain a large number of entries.

I found that decompression fails with a checksum error when trying to extract entry 65536 (zero-based index).

 

I was able to reproduce the issue with a simple 7-Zip archive containing 70,001 text files holding random MD5 checksums (attached).

The sample archive was created with the 7-Zip Windows client and uses LZMA2:3m.
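For reference, an archive with a similar layout could presumably also be generated with Commons Compress itself, roughly like the untested sketch below (entry names, contents and the output path are made up; only the entry count and the use of LZMA2 mirror the attached sample, and the LZMA2 settings will not exactly match the client's 3m preset).
{code:java}
import java.io.File;
import java.nio.charset.StandardCharsets;

import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
import org.apache.commons.compress.archivers.sevenz.SevenZMethod;
import org.apache.commons.compress.archivers.sevenz.SevenZOutputFile;

public class CreateBigSevenZipFile {
    public static void main(String[] args) throws Exception {
        // Write 70,001 small text entries, compressed with LZMA2 as in the attached sample.
        try (SevenZOutputFile out = new SevenZOutputFile(new File("big.7z"))) {
            out.setContentCompression(SevenZMethod.LZMA2);
            for (int i = 0; i < 70_001; i++) {
                SevenZArchiveEntry entry = new SevenZArchiveEntry();
                entry.setName("file" + i + ".txt");
                out.putArchiveEntry(entry);
                out.write(("content of file " + i).getBytes(StandardCharsets.UTF_8));
                out.closeArchiveEntry();
            }
        }
    }
}
{code}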

 

My code is a simple sequential read of all of the archive's contents, like this:
{code:java}
    @Test
    void readBigSevenZipFile() throws IOException
    {
        try (SevenZFile sevenZFile = new SevenZFile(new File("E:\\Temp\\0000_DOC.7z")))
        {
            SevenZArchiveEntry entry = sevenZFile.getNextEntry();
            while (entry != null)
            {
                if (entry.hasStream())
                {
                    byte[] content = new byte[(int) entry.getSize()];
                    sevenZFile.read(content);

                    System.out.println(entry.getName());
                }

                entry = sevenZFile.getNextEntry();
            }
        }
    }
{code}
which fails consistently after file65535.txt with
{code:java}
java.io.IOException: Checksum verification failed
        at org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94) ~[commons-compress-1.21.jar!/:1.21]
        at org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1905) ~[commons-compress-1.21.jar!/:1.21]
        at org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1888) ~[commons-compress-1.21.jar!/:1.21]
{code}
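A side note on the repro itself: as far as I can tell, SevenZFile.read(byte[]) is not guaranteed to fill the whole buffer in a single call, so a more defensive version of the inner read would loop until the entry's buffer is full, roughly like the untested sketch below. I do not know whether this changes anything about the failure at entry 65536.
{code:java}
import java.io.File;
import java.io.IOException;

import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
import org.apache.commons.compress.archivers.sevenz.SevenZFile;

public class ReadFullyRepro {
    public static void main(String[] args) throws IOException {
        try (SevenZFile sevenZFile = new SevenZFile(new File("E:\\Temp\\0000_DOC.7z"))) {
            SevenZArchiveEntry entry;
            while ((entry = sevenZFile.getNextEntry()) != null) {
                if (!entry.hasStream()) {
                    continue;
                }
                byte[] content = new byte[(int) entry.getSize()];
                // read(byte[], int, int) may return fewer bytes than requested,
                // so keep reading until this entry's buffer is full.
                int off = 0;
                while (off < content.length) {
                    int n = sevenZFile.read(content, off, content.length - off);
                    if (n < 0) {
                        break;
                    }
                    off += n;
                }
                System.out.println(entry.getName() + "\t" + off);
            }
        }
    }
}
{code}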
 

It is notable that the failing index, 65536, is 2 to the 16th power, which could suggest an overflow of some sort.
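To help narrow the problem down, the same traversal could presumably also be written against SevenZFile.getInputStream(entry) (available since Compress 1.20) instead of SevenZFile.read(byte[]). This is an untested sketch, not a known workaround; the path is the same placeholder as above.
{code:java}
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
import org.apache.commons.compress.archivers.sevenz.SevenZFile;
import org.apache.commons.compress.utils.IOUtils;

public class ReadViaEntryStreams {
    public static void main(String[] args) throws IOException {
        try (SevenZFile sevenZFile = new SevenZFile(new File("E:\\Temp\\0000_DOC.7z"))) {
            SevenZArchiveEntry entry;
            while ((entry = sevenZFile.getNextEntry()) != null) {
                if (entry.hasStream()) {
                    // Read this entry through its own stream rather than SevenZFile.read(byte[]).
                    try (InputStream in = sevenZFile.getInputStream(entry)) {
                        byte[] content = IOUtils.toByteArray(in);
                        System.out.println(entry.getName() + "\t" + content.length);
                    }
                }
            }
        }
    }
}
{code}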

 

While the minimal sample contains only small text files, I originally found the issue with larger archives that also contain image and PDF files. The archive's contents and size in bytes do not seem to have a direct influence on the issue; only the number of contained files does.

 

I have not found a workaround yet.

  was:
I have a use-case where I have to decompress Sevenzip archives from an external 
source which may have a large number of entries.

I found decompression fails when trying to extract entry 65536 (zero-based 
index) with a checksum failure.

 

I was able to reproduce the issue with a simple 7Zip file containing 70.001 
entries with random MD5 checksum textfiles (attached).

The sample Archive was created using the 7Zip Windows client and uses LZMA2:3m.

 

My code is a simple sequential read of all contents of the file like
{code:java}
    @Test
    void readBigSevenZipFile() throws IOException
    {
        int index = 0;

        try (SevenZFile sevenZFile = new SevenZFile(new 
File("E:\\Temp\\0000_DOC.7z")))
        {
            SevenZArchiveEntry entry = sevenZFile.getNextEntry();
            while (entry != null)
            {
                if (entry.hasStream())
                {
                    byte[] content = new byte[Math.toIntExact(entry.getSize())];
                    sevenZFile.read(content);

                    System.out.print(index);
                    System.out.print("\t");
                    System.out.println(entry.getName());
                    index++;
                }

                entry = sevenZFile.getNextEntry();
            }
        }
    }
{code}
which fails consistently after index 65535 with
{code:java}
java.io.IOException: Checksum verification failed
        at 
org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
 ~[commons-compress-1.21.jar!/:1.21]
        at 
org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1905)
 ~[commons-compress-1.21.jar!/:1.21]
        at 
org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1888)
 ~[commons-compress-1.21.jar!/:1.21]
{code}
 

It is noticeable that the value is 2 to the 16th power, which could suggest an 
overflow error of some sorts.

 

I did not find any workaround yet.


> Checksum verification failed on SevenZip archive containing more than 65536 
> entries
> -----------------------------------------------------------------------------------
>
>                 Key: COMPRESS-592
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-592
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Compressors
>    Affects Versions: 1.21
>         Environment: Compress 1.21 and XZ 1.9 on JDK 11
>            Reporter: Roland Kreuzer
>            Priority: Major
>         Attachments: 0000_DOC.7z
>
>
> I have a use-case where I have to decompress Sevenzip archives from an 
> external source which may have a large number of entries.
> I found decompression fails when trying to extract entry 65536 (zero-based 
> index) with a checksum failure.
>  
> I was able to reproduce the issue with a simple 7Zip file containing 70.001 
> entries with random MD5 checksum textfiles (attached).
> The sample Archive was created using the 7Zip Windows client and uses 
> LZMA2:3m.
>  
> My code is a simple sequential read of all contents of the file like
> {code:java}
>     @Test
>     void readBigSevenZipFile() throws IOException
>     {
>         try (SevenZFile sevenZFile = new SevenZFile(new 
> File("E:\\Temp\\0000_DOC.7z")))
>         {
>             SevenZArchiveEntry entry = sevenZFile.getNextEntry();
>             while (entry != null)
>             {
>                 if (entry.hasStream())
>                 {
>                     byte[] content = new byte[(int) entry.getSize()];
>                     sevenZFile.read(content);
>                     System.out.println(entry.getName());
>                 }
>                 entry = sevenZFile.getNextEntry();
>             }
>         }
>     }
> {code}
> which fails consistently after file65535.txt with
> {code:java}
> java.io.IOException: Checksum verification failed
>         at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>  ~[commons-compress-1.21.jar!/:1.21]
>         at 
> org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1905)
>  ~[commons-compress-1.21.jar!/:1.21]
>         at 
> org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1888)
>  ~[commons-compress-1.21.jar!/:1.21]
> {code}
>  
> It is noticeable that the value is 2 to the 16th power, which could suggest 
> an overflow error of some sorts.
>  
> While the minimum sample contains only small txt files, I originally found 
> the issue with larger archives containing also Image and PDF files. The 
> archive's contents or size in byte does not seem to have direct influence on 
> the issue, only the number of files contained within.
>  
> I did not find any workaround yet.


