[ 
https://issues.apache.org/jira/browse/COMPRESS-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846969#comment-17846969
 ] 

Mikaël MECHOULAM commented on COMPRESS-679:
-------------------------------------------

[~ggregory]  our tests are OK, thank you very much for your very quick fix. Regarding 
the release, we are only interested in the approximate date of the final version, 
to know whether it will fit the target release date of our own product. 

> Regression on parallel processing of 7zip files
> -----------------------------------------------
>
>                 Key: COMPRESS-679
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-679
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.26.0, 1.26.1
>            Reporter: Mikaël MECHOULAM
>            Assignee: Gary D. Gregory
>            Priority: Critical
>             Fix For: 1.26.2
>
>         Attachments: file.7z
>
>
> I've run into a bug which occurs when attempting to read a 7zip file from 
> several threads simultaneously. The following code illustrates the problem; 
> the file.7z is attached.
>  
> {code:java}
> import java.io.InputStream;
> import java.nio.file.Paths;
> import java.util.stream.IntStream;
> import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
> import org.apache.commons.compress.archivers.sevenz.SevenZFile;
>
> public class TestZip {
>     public static void main(final String[] args) {
>         final Runnable runnable = () -> {
>             try {
>                 try (final SevenZFile sevenZFile = SevenZFile.builder().setPath(Paths.get("file.7z")).get()) {
>                     SevenZArchiveEntry sevenZArchiveEntry;
>                     while ((sevenZArchiveEntry = sevenZFile.getNextEntry()) != null) {
>                         // The entry must not be the first of the archive to reproduce.
>                         if ("file4.txt".equals(sevenZArchiveEntry.getName())) {
>                             final InputStream inputStream = sevenZFile.getInputStream(sevenZArchiveEntry);
>                             // treatments...
>                             break;
>                         }
>                     }
>                 }
>             } catch (final Exception e) { // java.io.IOException: Checksum verification failed
>                 e.printStackTrace();
>             }
>         };
>         IntStream.range(0, 30).forEach(i -> new Thread(runnable).start());
>     }
> }
> {code}
> Below is the output I receive on version 1.26: 
>  
> {code:java}
> java.io.IOException: Checksum verification failed
>   at org.apache.commons.compress.utils.ChecksumVerifyingInputStream.verify(ChecksumVerifyingInputStream.java:98)
>   at org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:92)
>   at org.apache.commons.io.IOUtils.skip(IOUtils.java:2422)
>   at org.apache.commons.io.IOUtils.skip(IOUtils.java:2380)
>   at org.apache.commons.compress.archivers.sevenz.SevenZFile.getCurrentStream(SevenZFile.java:912)
>   at org.apache.commons.compress.archivers.sevenz.SevenZFile.getInputStream(SevenZFile.java:988)
>   at com.infotel.arcsys.nativ.archiving.zip.TestZip.lambda$main$0(TestZip.java:21)
>   at java.base/java.lang.Thread.run(Thread.java:833)
> {code}
> The issue appears to have been introduced in the transition from version 1.25 
> to 1.26 of Apache Commons Compress. In the library's {{SevenZFile}} class, the 
> private method {{getCurrentStream}} migrated from the internal 
> {{IOUtils.skip(InputStream, long)}} to a method with the same signature in 
> Commons IO, which changes the behavior. In version 1.26, it uses a shared, 
> unsynchronized buffer that is theoretically intended only for writing 
> ({{{}SCRATCH_BYTE_BUFFER_WO{}}}). This causes the checksum verification 
> failures within the library. The problem seems to be resolved by passing a 
> {{Supplier}} for the buffer to use:
> {code:java}
> try (InputStream stream = deferredBlockStreams.remove(0)) {
>     org.apache.commons.io.IOUtils.skip(stream, Long.MAX_VALUE,
>         () -> new byte[org.apache.commons.io.IOUtils.DEFAULT_BUFFER_SIZE]);
> }
> {code}
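For context, the hazard described in the report (concurrent readers all skipping through stream data via one shared, unsynchronized scratch array) can be sketched independently of Commons Compress. The helper below is hypothetical, not the library's code; it only illustrates the thread-safe pattern of allocating the skip buffer per call, which is what supplying a fresh buffer via the {{Supplier}} overload achieves.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SkipWithLocalBuffer {

    // Hypothetical helper: skips up to n bytes by reading into a buffer
    // allocated per invocation, so concurrent callers never share mutable
    // state (unlike a static scratch array).
    static long skipFully(final InputStream in, final long n) throws IOException {
        final byte[] buffer = new byte[8192]; // fresh buffer for this call only
        long remaining = n;
        while (remaining > 0) {
            final int read = in.read(buffer, 0, (int) Math.min(buffer.length, remaining));
            if (read < 0) {
                break; // end of stream reached before n bytes were skipped
            }
            remaining -= read;
        }
        return n - remaining; // number of bytes actually skipped
    }

    public static void main(final String[] args) throws IOException {
        // Deterministic test data: byte i holds the value i modulo 251.
        final byte[] data = new byte[20000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 251);
        }
        try (InputStream in = new ByteArrayInputStream(data)) {
            System.out.println(skipFully(in, 10000)); // 10000
            System.out.println(in.read());            // 211 (10000 % 251)
        }
    }
}
```

Because each call owns its buffer, many threads can skip through independent streams without one thread's read clobbering the bytes another thread is checksumming, which is the failure mode the stack trace above points at.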



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
