[ 
https://issues.apache.org/jira/browse/COMPRESS-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18068608#comment-18068608
 ] 

Mehmet Karaman edited comment on COMPRESS-722 at 3/26/26 11:52 AM:
-------------------------------------------------------------------

It is difficult to clearly understand the differences between the two source 
code snippets, so let us explicitly clarify them in order to address all 
identified issues:
 * *{{ZstdCompressorOutputStream.flush()}}*
Must be called; otherwise remaining buffered bytes are not written to the 
underlying ZIP stream, resulting in an incomplete ZSTD frame.

 * *{{ZipArchiveEntry.setSize(...)}}*
Must be set; otherwise decompression fails because the ZIP entry metadata does 
not contain the correct uncompressed size.

 * *{{ZstdCompressorOutputStream.close()}}*
Closes the underlying {{{}ZipArchiveOutputStream{}}}, which makes it unusable 
for writing subsequent ZIP entries. A wrapper such as {{noClose()}} is 
therefore required.

 

[~uwevoigt]: Could you please review this list to ensure that all relevant 
problems have been captured correctly?


[~ggregory]: Could you please decide which of these items should be classified 
as bugs (as opposed to usage requirements or API limitations)? After that I 
would like to start further investigations and provide the fixes.


was (Author: JIRAUSER308470):
It is difficult to clearly understand the differences between the two source 
code snippets, so let us explicitly clarify them in order to address all 
identified issues:
 * *{{ZstdCompressorOutputStream.flush()}}*
Must be called; otherwise remaining buffered bytes are not written to the 
underlying ZIP stream, resulting in an incomplete ZSTD frame.

 * *{{ZipArchiveEntry.setSize(...)}}*
Must be set; otherwise decompression fails because the ZIP entry metadata does 
not contain the correct uncompressed size.

 * *{{ZstdCompressorOutputStream.close()}}*
Closes the underlying {{{}ZipArchiveOutputStream{}}}, which makes it unusable 
for writing subsequent ZIP entries. A wrapper such as {{noClose()}} is 
therefore required.

[~uwevoigt]: Could you please review this list to ensure that all relevant 
problems have been captured correctly?
[~ggregory]: Could you please decide which of these items should be classified 
as bugs (as opposed to usage requirements or API limitations)? After that I 
would like to start further investigations and provide the fixes.

 

 

 

> ZStandard compression does not work correctly in ZipArchiveOutputStream
> -----------------------------------------------------------------------
>
>                 Key: COMPRESS-722
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-722
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Archivers, Compressors
>    Affects Versions: 1.28.0
>            Reporter: Uwe Voigt
>            Priority: Major
>
> Hi,
>  
> When writing a using ZSD compression this leads to corrupt archive.
> The expectation is (correct me if I am wrong) that the following code should 
> produce functional archive:
>  
> {code:java}
> @Test
> public void doSave() throws Exception {
>   try (ZipArchiveOutputStream out = new ZipArchiveOutputStream(new 
> File("test-zstd.zip"))) {
>     ZipArchiveEntry e1 = new ZipArchiveEntry("one.txt");
>     e1.setMethod(ZipMethod.ZSTD.getCode());
>     out.putArchiveEntry(e1);
>     out.write("""
>       text file
>       with multiple line
>     """.getBytes());
>     out.closeArchiveEntry();
>     ZipArchiveEntry e2 = new ZipArchiveEntry("two.txt");
>     e2.setMethod(ZipMethod.ZSTD.getCode());
>     out.putArchiveEntry(e2);
>     out.write("""
>       another text file
>       some more line
>     """.getBytes());
>     out.closeArchiveEntry();
>   }
>   org.apache.commons.compress.archivers.zip.ZipFile zipFile = 
> org.apache.commons.compress.archivers.zip.ZipFile.builder().setFile("test-zstd.zip").get();
>   Enumeration<ZipArchiveEntry> entries = zipFile.getEntries();
>   ZipArchiveEntry lastEntry = null;
>   while (entries.hasMoreElements()) {
>     ZipArchiveEntry entry = entries.nextElement();
>     System.out.println(entry.getName() + " / " + entry.getSize() + " / " + 
> entry.getCompressedSize());
>     if (!entries.hasMoreElements())
>       lastEntry = entry;
>   }
>   try (InputStream in = new 
> ZstdCompressorInputStream(zipFile.getInputStream(lastEntry))) {
>     System.out.println(new String(in.readAllBytes()));
>   }
> }
> {code}
> Instead, an exception is thrown:
>  
> {{}}
> {code:java}
> com.github.luben.zstd.ZstdIOException: Truncated source at 
> com.github.luben.zstd.ZstdInputStreamNoFinalizer.readInternal(ZstdInputStreamNoFinalizer.java:174)
>  at 
> com.github.luben.zstd.ZstdInputStreamNoFinalizer.read(ZstdInputStreamNoFinalizer.java:136)
>  at com.github.luben.zstd.ZstdInputStream.read(ZstdInputStream.java:103) at 
> org.apache.commons.compress.compressors.zstandard.ZstdCompressorInputStream.read(ZstdCompressorInputStream.java:119)
>  at java.base/java.io.InputStream.readNBytes(InputStream.java:412) at 
> java.base/java.io.InputStream.readAllBytes(InputStream.java:349) at 
> zipeditor.Zip64Test.doSave(Zip64Test.java:123) ...{code}
> The only working code would be this 
>  
>  
> {code:java}
> publicstatic OutputStream noClose(OutputStream out) {
> return new OutputStream() {
> @Override
> public void write(int b) throws IOException {
> out.write(b);
> }
>  
> @Override
> public void write(byte[] b, int off, int len) throws IOException {
> out.write(b, off, len);
> }
>  
> @Override
> public void flush() throws IOException {
> out.flush();
> }
>  
> @Override
> public void close() throws IOException {
> // do nothing
> }
> };
>  
> }
>  
> @Test
> public void doSave() throws Exception {
> try (ZipArchiveOutputStream out = new ZipArchiveOutputStream(new 
> File("test-zstd.zip"))) {
> ZipArchiveEntry e1 = new ZipArchiveEntry("one.txt");
> e1.setMethod(ZipMethod.ZSTD.getCode());
> byte[] bytes = """
> text file
> with multiple line
> """.getBytes();
> e1.setSize(bytes.length);
> out.putArchiveEntry(e1);
> try (OutputStream zstdOutput = 
> ZstdCompressorOutputStream.builder().setOutputStream(noClose(out)).get()) {
> zstdOutput.write(bytes);
> zstdOutput.flush();
> }
> out.closeArchiveEntry();
> ZipArchiveEntry e2 = new ZipArchiveEntry("two.txt");
> e2.setMethod(ZipMethod.ZSTD.getCode());
> String second = """
> another text file
> some more line
> """;
> e2.setSize(second.getBytes().length);
> out.putArchiveEntry(e2);
> try (OutputStream o2 = 
> ZstdCompressorOutputStream.builder().setOutputStream(noClose(out)).get()) {
> o2.write(second.getBytes());
> o2.flush();
> }
> out.closeArchiveEntry();
> }
>  
> boolean jni = true;
> org.apache.commons.compress.archivers.zip.ZipFile zipFile = 
> org.apache.commons.compress.archivers.zip.ZipFile
> .builder().setZstdInputStreamFactory((inpStream) -> jni ? new 
> ZstdInputStreamNoFinalizer(inpStream) : new 
> io.airlift.compress.zstd.ZstdInputStream(inpStream))
> .setFile("test-zstd.zip").get();
> Enumeration<ZipArchiveEntry> entries = zipFile.getEntries();
> ZipArchiveEntry lastEntry = null;
> while (entries.hasMoreElements()) {
> ZipArchiveEntry entry = entries.nextElement();
> System.out.println(entry.getName() + " / " + entry.getSize() + " / " + 
> entry.getCompressedSize());
> if (!entries.hasMoreElements())
> lastEntry = entry;
> }
> InputStream in = zipFile.getInputStream(lastEntry);
> System.out.println(new String(in.readAllBytes()));
>  
> zipFile.close();
> }
> {code}
>  
>  
>  
>  
> {color:#000000}I think it should not be left to the caller of the API to 
> create such a lot of code just to workaround something that should be done by 
> the library.{color}
>  
> {color:#000000}Can you please investigate and hopefully fix that 
> behavior.{color}
>  
> {color:#000000}Best regards,{color}
> {color:#000000}Uwe{color}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to