Thank you Volker for sharing your results. It is amazing, to improve such widely used feature (zip -> jar, http compression...)
Congratulations. Laurent Le jeu. 23 avr. 2020 à 12:19, Volker Simonis <[email protected]> a écrit : > On Thu, Apr 23, 2020 at 10:56 AM Laurent Bourgès > <[email protected]> wrote: > > > > Hi Volker, > > > > Could you give some benchmark results in the jbs bug to have an idea of > the performance gain ? > > The results of a benchmark run are at the end of the microbenchmark > which is attached to https://bugs.openjdk.java.net/browse/JDK-8242848 > > I've attached them here for your convenience. The first run is before, > the second run after applying this patch. > > E.g. for a user supplied buffer of size 16384, the time for inflation > decreases from "2.603 ± 0.404 ms/op" to "2.187 ± 0.126 ms/op" which > is an improvement of 16%. Obviously, increasing the default internal > buffer size from 512 to 16384 (which I want to do in > https://bugs.openjdk.java.net/browse/JDK-8242864) will get you even > bigger speedups if you use the default buffer size :) > > /output/jdk-opt/images/jdk/bin/java > ---------------------------------------------------------------------- > Benchmark (scale) (size) Mode Cnt Score > Error Units > InflaterOutputStreamWrite.write 1 512 avgt 3 5.378 ± > 0.301 ms/op > InflaterOutputStreamWrite.write 1 1024 avgt 3 3.917 ± > 0.165 ms/op > InflaterOutputStreamWrite.write 1 2048 avgt 3 3.158 ± > 0.097 ms/op > InflaterOutputStreamWrite.write 1 4096 avgt 3 2.707 ± > 0.138 ms/op > InflaterOutputStreamWrite.write 1 8192 avgt 3 2.600 ± > 0.399 ms/op > InflaterOutputStreamWrite.write 1 16384 avgt 3 2.603 ± > 0.404 ms/op > InflaterOutputStreamWrite.write 1 32768 avgt 3 2.622 ± > 0.211 ms/op > InflaterOutputStreamWrite.write 1 65536 avgt 3 2.605 ± > 0.170 ms/op > InflaterOutputStreamWrite.write 2 512 avgt 3 4.015 ± > 0.150 ms/op > InflaterOutputStreamWrite.write 2 1024 avgt 3 3.225 ± > 0.178 ms/op > InflaterOutputStreamWrite.write 2 2048 avgt 3 2.745 ± > 0.261 ms/op > InflaterOutputStreamWrite.write 2 4096 avgt 3 2.614 ± > 0.542 ms/op > InflaterOutputStreamWrite.write 2 8192 avgt 3 2.593 ± > 0.206 ms/op > InflaterOutputStreamWrite.write 2 16384 avgt 3 2.606 ± > 0.055 ms/op > InflaterOutputStreamWrite.write 2 32768 avgt 3 2.611 ± > 0.116 ms/op > InflaterOutputStreamWrite.write 2 65536 avgt 3 2.617 ± > 0.170 ms/op > InflaterOutputStreamWrite.write 4 512 avgt 3 3.376 ± > 0.599 ms/op > InflaterOutputStreamWrite.write 4 1024 avgt 3 2.840 ± > 0.155 ms/op > InflaterOutputStreamWrite.write 4 2048 avgt 3 2.633 ± > 0.550 ms/op > InflaterOutputStreamWrite.write 4 4096 avgt 3 2.598 ± > 0.166 ms/op > InflaterOutputStreamWrite.write 4 8192 avgt 3 2.602 ± > 0.054 ms/op > InflaterOutputStreamWrite.write 4 16384 avgt 3 2.601 ± > 0.039 ms/op > InflaterOutputStreamWrite.write 4 32768 avgt 3 2.639 ± > 0.020 ms/op > InflaterOutputStreamWrite.write 4 65536 avgt 3 2.619 ± > 0.260 ms/op > InflaterOutputStreamWrite.write 8 512 avgt 3 2.882 ± > 0.149 ms/op > InflaterOutputStreamWrite.write 8 1024 avgt 3 2.695 ± > 0.586 ms/op > InflaterOutputStreamWrite.write 8 2048 avgt 3 2.644 ± > 0.472 ms/op > InflaterOutputStreamWrite.write 8 4096 avgt 3 2.616 ± > 0.052 ms/op > InflaterOutputStreamWrite.write 8 8192 avgt 3 2.616 ± > 0.063 ms/op > InflaterOutputStreamWrite.write 8 16384 avgt 3 2.611 ± > 0.090 ms/op > InflaterOutputStreamWrite.write 8 32768 avgt 3 2.633 ± > 0.216 ms/op > InflaterOutputStreamWrite.write 8 65536 avgt 3 2.634 ± > 0.180 ms/op > > /priv/simonisv/output/jdk-opt-JDK-8242848/images/jdk/bin/java > ---------------------------------------------------------------------- > Benchmark (scale) (size) Mode Cnt Score > Error Units > InflaterOutputStreamWrite.write 1 512 avgt 3 5.388 ± > 0.349 ms/op > InflaterOutputStreamWrite.write 1 1024 avgt 3 3.799 ± > 0.093 ms/op > InflaterOutputStreamWrite.write 1 2048 avgt 3 2.994 ± > 0.023 ms/op > InflaterOutputStreamWrite.write 1 4096 avgt 3 2.583 ± > 0.159 ms/op > InflaterOutputStreamWrite.write 1 8192 avgt 3 2.325 ± > 0.345 ms/op > InflaterOutputStreamWrite.write 1 16384 avgt 3 2.187 ± > 0.126 ms/op > InflaterOutputStreamWrite.write 1 32768 avgt 3 2.073 ± > 0.083 ms/op > InflaterOutputStreamWrite.write 1 65536 avgt 3 2.007 ± > 0.153 ms/op > InflaterOutputStreamWrite.write 2 512 avgt 3 3.996 ± > 0.037 ms/op > InflaterOutputStreamWrite.write 2 1024 avgt 3 3.089 ± > 0.023 ms/op > InflaterOutputStreamWrite.write 2 2048 avgt 3 2.628 ± > 0.073 ms/op > InflaterOutputStreamWrite.write 2 4096 avgt 3 2.356 ± > 0.344 ms/op > InflaterOutputStreamWrite.write 2 8192 avgt 3 2.202 ± > 0.055 ms/op > InflaterOutputStreamWrite.write 2 16384 avgt 3 2.081 ± > 0.033 ms/op > InflaterOutputStreamWrite.write 2 32768 avgt 3 2.015 ± > 0.169 ms/op > InflaterOutputStreamWrite.write 2 65536 avgt 3 1.985 ± > 0.196 ms/op > InflaterOutputStreamWrite.write 4 512 avgt 3 3.325 ± > 0.920 ms/op > InflaterOutputStreamWrite.write 4 1024 avgt 3 2.740 ± > 0.156 ms/op > InflaterOutputStreamWrite.write 4 2048 avgt 3 2.415 ± > 0.370 ms/op > InflaterOutputStreamWrite.write 4 4096 avgt 3 2.250 ± > 0.012 ms/op > InflaterOutputStreamWrite.write 4 8192 avgt 3 2.115 ± > 0.085 ms/op > InflaterOutputStreamWrite.write 4 16384 avgt 3 2.042 ± > 0.099 ms/op > InflaterOutputStreamWrite.write 4 32768 avgt 3 1.988 ± > 0.185 ms/op > InflaterOutputStreamWrite.write 4 65536 avgt 3 1.975 ± > 0.171 ms/op > InflaterOutputStreamWrite.write 8 512 avgt 3 2.870 ± > 0.035 ms/op > InflaterOutputStreamWrite.write 8 1024 avgt 3 2.495 ± > 0.334 ms/op > InflaterOutputStreamWrite.write 8 2048 avgt 3 2.280 ± > 0.056 ms/op > InflaterOutputStreamWrite.write 8 4096 avgt 3 2.155 ± > 0.073 ms/op > InflaterOutputStreamWrite.write 8 8192 avgt 3 2.046 ± > 0.079 ms/op > InflaterOutputStreamWrite.write 8 16384 avgt 3 1.995 ± > 0.098 ms/op > InflaterOutputStreamWrite.write 8 32768 avgt 3 1.979 ± > 0.119 ms/op > InflaterOutputStreamWrite.write 8 65536 avgt 3 1.986 ± > 0.155 ms/op > > > > > Thanks, > > Laurent > > > > Le jeu. 23 avr. 2020 à 10:20, Langer, Christoph < > [email protected]> a écrit : > >> > >> Hi Volker, > >> > >> since it's not yet pushed, I also went over the change and I like it. +1 > >> > >> One little style nit caught my eye, which you could correct before > pushing: The style of the if/else blocks in > test/jdk/java/util/zip/DeflateIn_InflateOut.java in lines 48/49 and 59/60 > does not match the other if/else blocks in the file. You should probably > have the else on the line of the closing bracket before... > >> > >> Thanks > >> Christoph > >> > >> > >> > -----Original Message----- > >> > From: core-libs-dev <[email protected]> On > Behalf > >> > Of Volker Simonis > >> > Sent: Mittwoch, 22. April 2020 22:09 > >> > To: Lance Andersen <[email protected]> > >> > Cc: Java Core Libs <[email protected]>; Vyom Tewari > >> > <[email protected]> > >> > Subject: Re: RFR(S): 8242848: Improve performance of > >> > InflaterOutputStream.write() > >> > > >> > On Tue, Apr 21, 2020 at 5:23 PM Lance Andersen > >> > <[email protected]> wrote: > >> > > > >> > > Hi Volker, > >> > > > >> > > I think overall this looks OK. I went through the older SCCS > histories to see > >> > if I could figure out why they were using 512 for the input length > but could > >> > not find anything that might shed some light for me. > >> > > > >> > > >> > Hi Lance, > >> > > >> > thanks a lot for digging in the old sources to review my change. It's > >> > great that we stil have people who can use SCCS :) > >> > > >> > > I am not sure you can guarantee that src.zip exists but others > might be able > >> > to weigh in here. What we have been trying to do going forward is to > have > >> > the tests create the zip files that it needs. In some cases, we > have created a > >> > byte array within the test which represents the zip and just write it > out > >> > before the test begins. > >> > > > >> > > >> > Yes, the dependency on an external file was not nice, so I rewrote the > >> > benchmark to programmatically create a file which can be compressed by > >> > a factor of ~6: > >> > > >> > http://cr.openjdk.java.net/~simonis/webrevs/2020/8242848.02/ > >> > > >> > Notice that this new version only changes the microbenchmark, all the > >> > other files are untouched. > >> > > >> > As everybody seemed to be happy with the change itself and the > >> > regression test, I'm now waiting for your and Clae's final review of > >> > the microbenchmark before pushing. Please let me know hat you think? > >> > > >> > Best regards, > >> > Volker > >> > > >> > > I am hoping others with more history might also chime in case they > are > >> > aware of the history here. > >> > > > >> > > Thank you for helping improve the performance. > >> > > > >> > > Best > >> > > Lance > >> > > > >> > > On Apr 17, 2020, at 6:49 AM, Volker Simonis < > [email protected]> > >> > wrote: > >> > > > >> > > Thanks everybody for looking at this change! > >> > > > >> > > Please find an updated webrev (with a new test and micro-benchmark) > >> > > and my answers to your comments below: > >> > > > >> > > http://cr.openjdk.java.net/~simonis/webrevs/2020/8242848.01/ > >> > > > >> > > On Thu, Apr 16, 2020 at 6:40 AM Vyom Tiwari <[email protected]> > >> > wrote: > >> > > > >> > > Thanks for doing this, i think the below code change is not > required . > >> > > > >> > > Please do let me know if i am not reading it correctly. > >> > > > >> > > if (inf.finished() || (len == 0)/* no more input */) { > >> > > > >> > > If you check the native code "inf.finished() will be true only of > the > >> > corresponding native call inflate(strm, Z_PARTIAL_FLUSH) returns > >> > "Z_STREAM_END". > >> > > > >> > > After your code change write may return even if finished() is not > true. > >> > > > >> > > > >> > > Yes, that's true, but that's what we must do if there's no more > input > >> > > available. Before my change this break on "len < 1" was in the "if > >> > > (inf.needsInput())" branch. > >> > > > >> > > On Thu, Apr 16, 2020 at 8:22 AM Thomas Stüfe > >> > <[email protected]> wrote: > >> > > > >> > > 252 // Check the decompressor > >> > > 253 if (inf.finished() || (len == 0)/* no more > input */) { > >> > > 254 break; > >> > > 255 } > >> > > > >> > > Not sure but I think this is wrong because now you bypass the > followup > >> > handling of inf.needsDirectory. > >> > > > >> > > Inflater.inflate() returns 0 if either needsInput or > needsDirectory. We have > >> > to ignore needsInput since we have no input anymore, but > needsDirectory > >> > has to be handled, no? > >> > > > >> > > > >> > > You're absolutely right Thomas. Thanks for catching this! I've moved > >> > > the check for "needsDictionary" in front of the "finished() || len > == > >> > > 0" check. > >> > > > >> > > Unfortunately there is not very good test coverage for zip with > preset > >> > > dictionaries (jtreg and submit repo passed without problems). So I > >> > > added a new test for this use case to " > >> > > test/jdk/java/util/zip/DeflateIn_InflateOut.java". > >> > > > >> > > On Thu, Apr 16, 2020 at 8:48 AM Thomas Stüfe > >> > <[email protected]> wrote: > >> > > > >> > > > >> > > As for increasing the buffer size, it makes sense but IMHO needs a > CSR and > >> > a release note. > >> > > > >> > > > >> > > I don't think so. This is an internal, implementation specific > setting > >> > > which has never been exposed or documented before so I don't see why > >> > > we should document it now. But let's discuss this separately in the > >> > > corresponding JBS issue (see below). > >> > > > >> > > On Thu, Apr 16, 2020 at 1:18 PM Claes Redestad > >> > > <[email protected]> wrote: > >> > > > >> > > > >> > > Hi Volker, > >> > > > >> > > On 2020-04-15 19:48, Volker Simonis wrote: > >> > > > >> > > While doing this change, I've also realized that all the streams in > >> > > java.util.zip (i.e. DeflaterInputStream, GZIPInputStream, > >> > > GZIPOutputStream, InflaterInputStream, DeflaterOutputStream) use an > >> > > internal byte buffer of 512 bytes by default. Looking at the > benchmark > >> > > attached to JDK-8242864, I think that increasing this default to a > >> > > bigger size (e.g. 4096 bytes) will considerably speed up (up to 50%) > >> > > read and write operations on these streams when they are created > with > >> > > the default buffer size. I think the value "512" is a legacy of old > >> > > times when memory was more precious:) so I've opened "JDK-8242864: > >> > > Increase the default, internal buffer size of the Streams in > >> > > java.util.zip" to track that as as separate issue. Do you think it > >> > > makes sense to increase that default value? > >> > > > >> > > > >> > > Seems reasonable. 8192 seems to be the buffer size we've been > >> > converging > >> > > on elsewhere (see InputStream, BufferedInputStream, Files, ..). I > also > >> > > > >> > > > >> > > That seems reasonable. Alan commented on the JBS issue so we can > >> > > continue the discussion there. > >> > > > >> > > found an instance of 8096, which is either a typo or a clever > >> > > optimization to keep the array + array object header fit snugly > within > >> > > 8Kb. You chose. :-) > >> > > > >> > > > >> > > I like how you try to be positive :) > >> > > > >> > > > >> > > Thank you and best regards, > >> > > Volker > >> > > > >> > > PS: do you think it makes sense to contribute the benchmark attached > >> > > to JDK-8242864 to the code-tools/mh-jdk-microbenchmarks [1] project? > >> > > > >> > > [1]http://openjdk.java.net/projects/code-tools/jmh-jdk- > >> > microbenchmarks/ > >> > > > >> > > > >> > > I'd definitely welcome the micro as part of the patch under > >> > > test/micro/org/openjdk/bench/java/util/zip - additionally > contributing > >> > > > >> > > > >> > > I knew that "jmh-jdk-microbenchmarks" has been copied to the jdk > repo > >> > > but somehow I did found "jmh-jdk-microbenchmarks" before looking in > >> > > the obvious place :) > >> > > > >> > > So I've added the benchmark to the patch now. There's one thing I'm > >> > > not sure however. The benchmark requires a "big" (several 100k) with > >> > > good compression ratio (e.g. a large text file). I've decided to > use a > >> > > big Java source file from "src.zip" but I'm not sure if "src.zip" is > >> > > always available in the jdk images which are used to run the > >> > > microbenchmarks. Do you think the test it is fine this way or do you > >> > > have a better idea? I saw that "ZipFind" uses "microbenchmarks.jar" > >> > > (i.e. the container of the test itself) but that file is already > >> > > compressed so the compression rate won't be that good. > >> > > > >> > > Another thing I couldn't figure out is a good way to skip the > >> > > benchmark when I realize that I can't load the expected file in the > >> > > "@Setup" method. Do you now anything better than just throwing an > >> > > exception? > >> > > > >> > > Thank you and best regards, > >> > > Volker > >> > > > >> > > to jmh-jdk-microbenchmarks could enable you to test the micro on 8 > or > >> > > 11. > >> > > > >> > > /Claes > >> > > > >> > > > >> > > > >> > > Lance Andersen| Principal Member of Technical Staff | > +1.781.442.2037 > >> > > Oracle Java Engineering > >> > > 1 Network Drive > >> > > Burlington, MA 01803 > >> > > [email protected] > >> > > > >> > > > >> > > >
