[jira] [Work logged] (COMPRESS-485) Reproducible Builds: keep entries order when gathering ScatterZipOutputStream content in ParallelScatterZipCreator

ASF GitHub Bot (JIRA) Sat, 11 May 2019 13:43:11 -0700


     [ 
https://issues.apache.org/jira/browse/COMPRESS-485?focusedWorklogId=240671&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-240671
 ]


ASF GitHub Bot logged work on COMPRESS-485:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/May/19 20:42
            Start Date: 11/May/19 20:42
    Worklog Time Spent: 10m 
      Work Description: Tibor17 commented on pull request #78: COMPRESS-485 
keep zip entries order in parallel zip creation
URL: https://github.com/apache/commons-compress/pull/78#discussion_r283110877
 
 

 ##########
 File path: 
src/main/java/org/apache/commons/compress/archivers/zip/ParallelScatterZipCreator.java
 ##########
 @@ -255,8 +257,13 @@ public void writeTo(final ZipArchiveOutputStream 
targetStream)
             compressionDoneAt = System.currentTimeMillis();
 
             synchronized (streams) {
+                // write zip entries in the order they were added (kept as 
futures)
+                for (final Future<ScatterZipOutputStream> future : futures) {
 
 Review comment:
   This loop was added. And it is a problem in this synchronized block.
   The monitor lock is `streams` and not `futures`. They cannot be both in one 
statement.
   Problem: this will cause NPE when the CPU is overloaded. The same issue was 
fixed in Surefire (see memory model).
   Additionally the `futures.add()` is not synchronized!
   One way would be for you to introduce double synchro block.
   I would not do it! Instead I would avoid synchronizations of `ArrayList` and 
on both. Then I would use modern collections which do not need external 
synchronizatrions (i.e. `Collections.synch...`)  and this way there would not 
be any `synchronized`. Use `ConcurrentLinkedDequeue`. It is useful when you add 
to the tail of the queue and do not remove, and if you iterate, because this 
implementation safely iterates even if you change it : no issue.  
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 240671)
    Time Spent: 1h 20m  (was: 1h 10m)

> Reproducible Builds: keep entries order when gathering ScatterZipOutputStream 
> content in ParallelScatterZipCreator
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-485
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-485
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Archivers
>    Affects Versions: 1.18
>            Reporter: Hervé Boutemy
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> currently, zip files created using ParallelScatterZipCreator have random 
> order.
> This is causing issues when trying to do Reproducible Builds with Maven 
> MNG-6276
> Studying ParallelScatterZipCreator, entries are kept sorted in memory in 
> futures list: instead of writing each full scatter in sequence, iterating 
> over futures should permit to write each zip entry in original order, without 
> changing the API or any performance of the gathering process



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Work logged] (COMPRESS-485) Reproducible Builds: keep entries order when gathering ScatterZipOutputStream content in ParallelScatterZipCreator

Reply via email to