Very good, thank you! Also note that the "new" implementation also requires *a lot* more heap, since all *uncompressed* file content is copied to the heap before deflating.
Best regards, /Lennart Börjeson > 15 apr. 2019 kl. 18:34 skrev Claes Redestad <claes.redes...@oracle.com>: > > Hi Lennart, > > I can reproduce this locally, and have narrowed down to > https://bugs.openjdk.java.net/browse/JDK-8034802 as the cause. > > As you say the compression is deferred to ZipFileSystem.close() now, > whereas previously it happened eagerly. We will have to analyze the > changes more in-depth to try and see why this is the case. > > Thanks! > > /Claes > > On 2019-04-15 15:32, Lennart Börjeson wrote: >> I have made a small command-line utility which creates zip archives by >> compressing the input files in parallel. >> I do this by calling Files.copy(input, zipOutputStream) in a parallel Stream >> over all input files. >> I have run this with Java 1.8, 9, 10, and 11, on both my local laptop and on >> server-class machines with up to 40 cores. >> It scales rather well and I get the expected speedup, i.e. roughly >> proportional to the number of cores. >> With OpenJDK 12, however, I get no speedup whatsoever. My understanding is >> that in JDK 12, the copy method simply >> transfers the contents of the input stream to a ByteArrayOutputStream, which >> is then deflated when I close the ZipFileSystem (by the ZipFileSystem.sync() >> method). >> Previously, the deflation was done when in the call to Files.copy, thus >> executed in parallel, and the final ZipFileSystem.close() didn't do anything >> much. >> (But I may of course be wrong. In case there's a simpler explanation and >> something I can remedy in my code, please let me know.) >> I have a small GitHub gist for the utility here: >> https://gist.github.com/lenborje/6d2f92430abe4ba881e3c5ff83736923 >> <https://gist.github.com/lenborje/6d2f92430abe4ba881e3c5ff83736923> >> Steps to reproduce (timings on my 8-core MacBook Pro): >> 1) Get the contents of the gist as a single file, Zip.java >> 2) Compile it using Java 8. >> $ export JAVA_HOME=<PATH-TO-JDK8-HOME> >> $ javac -encoding utf8 Zip.java >> 3) run on a sufficiently large number of files to exercise the parallelity: >> (I've used about 70 text files ca 60MB each) >> $ time java -Xmx6g Zip -p /tmp/test.zip <WHATEVER>/*.log. >> Working on ZIP FileSystem jar:file:/tmp/test.zip, using the options >> [PARALLEL] >> ... >> completed zip archive, now closing... done! >> real 0m35.558s >> user 3m58.134s >> sys 0m5.543s >> As is evident from the ratio between "user time" and "real time", all cores >> have been busy most of the time. >> (Running with JDK 9, 10, and 11 produces similar timings.) >> But running with JDK 12 defeats the parallelism: >> $ export JAVA_HOME=<PATH-TO-JDK12-HOME> >> $ rm /tmp/test.zip # From previous run >> $ time java -Xmx6g Zip -p /tmp/test.zip <WHATEVER>/*.log. >> Working on ZIP FileSystem jar:file:/tmp/test.zip, using the options >> [PARALLEL] >> ... >> completed zip archive, now closing... done! >> real 3m1.187s >> user 3m5.422s >> sys 0m12.396s >> Now there's almost no speedup. When observing the output, note that the >> ZipFileSystem.close() method is called immediately after the "now >> closing..." output, and "Done!" Is written when it returns, and when running >> with JDK 12 almost all running time is apparently spent there. >> I'm hoping the previous behaviour could somehow be restored, i.e. that >> deflation actually happens when I'm copying the input files to the >> ZipFileSystem, and not when I close it. >> Best regards, >> /Lennart Börjeson >>> 12 apr. 2019 kl. 14:25 skrev Lennart Börjeson <lenbo...@gmail.com>: >>> >>> I've found what I believe is a rather severe performance regression in >>> ZipFileSystem. 1.8 and 11 runs OK, 12 does not. >>> >>> Is this the right forum to report such issues? >>> >>> Best regards, >>> >>> /Lennart Börjeson