[jira] [Updated] (COMPRESS-643) ZipArchiveInputStream.getCompressedCount is not calculated properly

Zsolt F (Jira) Thu, 04 May 2023 01:10:07 -0700


     [ https://issues.apache.org/jira/browse/COMPRESS-643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Zsolt F updated COMPRESS-643:
-----------------------------
    Description: 
h2. Context

While iterating over the zip entries with ZipArchiveInputStream, the
getCompressedCount and getUncompressedCount methods do not return the correct
values when the stream content is not fully read.
h2. Demo

The zip file used in the code snippets is attached to the Jira issue.

*Good behaviour*

Executing the following code works as expected:

 
{code:java}
final ZipArchiveInputStream stream = new ZipArchiveInputStream(new FileInputStream("test.zip"));
while (true)
{
    final ZipArchiveEntry nextZipEntry = stream.getNextZipEntry();
    if (null == nextZipEntry)
    {
        break;
    }
    // read all of the entry's content
    stream.readAllBytes();

    System.out.println(String.format("[%s] compressed size: [%d] uncompressed size: [%d], calculated ratio: [%.2f]",
            nextZipEntry.getName(),
            stream.getCompressedCount(),
            stream.getUncompressedCount(),
            (double) stream.getCompressedCount() / stream.getUncompressedCount()));

} {code}
Output:
{code:java}
[first.xml] compressed size: [475830] uncompressed size: [16239665], calculated ratio: [0.03]
[last.xml] compressed size: [2221] uncompressed size: [45481], calculated ratio: [0.05] {code}
*Bad behaviour*

The next code snippet does not read the second entry fully, only its first 16
bytes, and in this case the calculated values are wrong.
{code:java}
final ZipArchiveInputStream stream = new ZipArchiveInputStream(new FileInputStream("test.zip"));
while (true)
{
    final ZipArchiveEntry nextZipEntry = stream.getNextZipEntry();
    if (null == nextZipEntry)
    {
        break;
    }
    // read only the first entry fully; read just 16 bytes of any other entry
    if ("first.xml".equals(nextZipEntry.getName()))
    {
        stream.readAllBytes();
    }
    else
    {
        stream.readNBytes(16);
    }

    System.out.println(String.format("[%s] compressed size: [%d] uncompressed size: [%d], calculated ratio: [%.2f]",
            nextZipEntry.getName(),
            stream.getCompressedCount(),
            stream.getUncompressedCount(),
            (double) stream.getCompressedCount() / stream.getUncompressedCount()));

} {code}
Output:
{code:java}
[first.xml] compressed size: [475830] uncompressed size: [16239665], calculated ratio: [0.03]
[last.xml] compressed size: [81] uncompressed size: [16], calculated ratio: [5.06] {code}
The calculated ratio for last.xml is wrong because both the compressed size and
the uncompressed size are wrong.
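
For reference, the ratios in the two outputs can be re-derived from the reported counts alone. The sketch below is not part of the original report: the class and method names (RatioCheck, ratio) are made up for illustration, and Locale.ROOT is an added assumption to keep the decimal separator stable. It repeats only the division and "%.2f" formatting of the demo code, which suggests that 5.06 is simply 81 / 16, so the ratio is off only because the two counts are not the totals for the entry.

```java
import java.util.Locale;

public class RatioCheck {
    // Hypothetical helper: same division and "%.2f" formatting as the demo code,
    // with Locale.ROOT added so the decimal separator is always '.'.
    static String ratio(long compressedCount, long uncompressedCount) {
        return String.format(Locale.ROOT, "%.2f", (double) compressedCount / uncompressedCount);
    }

    public static void main(String[] args) {
        // Counts copied from the outputs in the report.
        System.out.println("first.xml, fully read:      " + ratio(475830, 16239665));
        System.out.println("last.xml, fully read:       " + ratio(2221, 45481));
        System.out.println("last.xml, 16 bytes read:    " + ratio(81, 16));
    }
}
```

A ratio above 1 for a compressed XML file is a quick signal that the counters do not describe the whole entry.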

 

This issue is reproducible when iterating over the zip entries and reading the
content only for the last entry.

 

 


> ZipArchiveInputStream.getCompressedCount is not calculated properly
> -------------------------------------------------------------------
>
>                 Key: COMPRESS-643
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-643
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Compressors
>    Affects Versions: 1.21, 1.23.0
>            Reporter: Zsolt F
>            Priority: Major
>             Fix For: 2.0
>
>         Attachments: test.zip



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
