Hi Volker, since it's not yet pushed, I also went over the change and I like it. +1 😊
One little style nit caught my eye, which you could correct before pushing: The style of the if/else blocks in test/jdk/java/util/zip/DeflateIn_InflateOut.java in lines 48/49 and 59/60 does not match the other if/else blocks in the file. You should probably have the else on the line of the closing bracket before... Thanks Christoph > -----Original Message----- > From: core-libs-dev <[email protected]> On Behalf > Of Volker Simonis > Sent: Mittwoch, 22. April 2020 22:09 > To: Lance Andersen <[email protected]> > Cc: Java Core Libs <[email protected]>; Vyom Tewari > <[email protected]> > Subject: Re: RFR(S): 8242848: Improve performance of > InflaterOutputStream.write() > > On Tue, Apr 21, 2020 at 5:23 PM Lance Andersen > <[email protected]> wrote: > > > > Hi Volker, > > > > I think overall this looks OK. I went through the older SCCS histories to > > see > if I could figure out why they were using 512 for the input length but could > not find anything that might shed some light for me. > > > > Hi Lance, > > thanks a lot for digging in the old sources to review my change. It's > great that we stil have people who can use SCCS :) > > > I am not sure you can guarantee that src.zip exists but others might be able > to weigh in here. What we have been trying to do going forward is to have > the tests create the zip files that it needs. In some cases, we have > created a > byte array within the test which represents the zip and just write it out > before the test begins. > > > > Yes, the dependency on an external file was not nice, so I rewrote the > benchmark to programmatically create a file which can be compressed by > a factor of ~6: > > http://cr.openjdk.java.net/~simonis/webrevs/2020/8242848.02/ > > Notice that this new version only changes the microbenchmark, all the > other files are untouched. > > As everybody seemed to be happy with the change itself and the > regression test, I'm now waiting for your and Clae's final review of > the microbenchmark before pushing. Please let me know hat you think? > > Best regards, > Volker > > > I am hoping others with more history might also chime in case they are > aware of the history here. > > > > Thank you for helping improve the performance. > > > > Best > > Lance > > > > On Apr 17, 2020, at 6:49 AM, Volker Simonis <[email protected]> > wrote: > > > > Thanks everybody for looking at this change! > > > > Please find an updated webrev (with a new test and micro-benchmark) > > and my answers to your comments below: > > > > http://cr.openjdk.java.net/~simonis/webrevs/2020/8242848.01/ > > > > On Thu, Apr 16, 2020 at 6:40 AM Vyom Tiwari <[email protected]> > wrote: > > > > Thanks for doing this, i think the below code change is not required . > > > > Please do let me know if i am not reading it correctly. > > > > if (inf.finished() || (len == 0)/* no more input */) { > > > > If you check the native code "inf.finished() will be true only of the > corresponding native call inflate(strm, Z_PARTIAL_FLUSH) returns > "Z_STREAM_END". > > > > After your code change write may return even if finished() is not true. > > > > > > Yes, that's true, but that's what we must do if there's no more input > > available. Before my change this break on "len < 1" was in the "if > > (inf.needsInput())" branch. > > > > On Thu, Apr 16, 2020 at 8:22 AM Thomas Stüfe > <[email protected]> wrote: > > > > 252 // Check the decompressor > > 253 if (inf.finished() || (len == 0)/* no more input */) { > > 254 break; > > 255 } > > > > Not sure but I think this is wrong because now you bypass the followup > handling of inf.needsDirectory. > > > > Inflater.inflate() returns 0 if either needsInput or needsDirectory. We have > to ignore needsInput since we have no input anymore, but needsDirectory > has to be handled, no? > > > > > > You're absolutely right Thomas. Thanks for catching this! I've moved > > the check for "needsDictionary" in front of the "finished() || len == > > 0" check. > > > > Unfortunately there is not very good test coverage for zip with preset > > dictionaries (jtreg and submit repo passed without problems). So I > > added a new test for this use case to " > > test/jdk/java/util/zip/DeflateIn_InflateOut.java". > > > > On Thu, Apr 16, 2020 at 8:48 AM Thomas Stüfe > <[email protected]> wrote: > > > > > > As for increasing the buffer size, it makes sense but IMHO needs a CSR and > a release note. > > > > > > I don't think so. This is an internal, implementation specific setting > > which has never been exposed or documented before so I don't see why > > we should document it now. But let's discuss this separately in the > > corresponding JBS issue (see below). > > > > On Thu, Apr 16, 2020 at 1:18 PM Claes Redestad > > <[email protected]> wrote: > > > > > > Hi Volker, > > > > On 2020-04-15 19:48, Volker Simonis wrote: > > > > While doing this change, I've also realized that all the streams in > > java.util.zip (i.e. DeflaterInputStream, GZIPInputStream, > > GZIPOutputStream, InflaterInputStream, DeflaterOutputStream) use an > > internal byte buffer of 512 bytes by default. Looking at the benchmark > > attached to JDK-8242864, I think that increasing this default to a > > bigger size (e.g. 4096 bytes) will considerably speed up (up to 50%) > > read and write operations on these streams when they are created with > > the default buffer size. I think the value "512" is a legacy of old > > times when memory was more precious:) so I've opened "JDK-8242864: > > Increase the default, internal buffer size of the Streams in > > java.util.zip" to track that as as separate issue. Do you think it > > makes sense to increase that default value? > > > > > > Seems reasonable. 8192 seems to be the buffer size we've been > converging > > on elsewhere (see InputStream, BufferedInputStream, Files, ..). I also > > > > > > That seems reasonable. Alan commented on the JBS issue so we can > > continue the discussion there. > > > > found an instance of 8096, which is either a typo or a clever > > optimization to keep the array + array object header fit snugly within > > 8Kb. You chose. :-) > > > > > > I like how you try to be positive :) > > > > > > Thank you and best regards, > > Volker > > > > PS: do you think it makes sense to contribute the benchmark attached > > to JDK-8242864 to the code-tools/mh-jdk-microbenchmarks [1] project? > > > > [1]http://openjdk.java.net/projects/code-tools/jmh-jdk- > microbenchmarks/ > > > > > > I'd definitely welcome the micro as part of the patch under > > test/micro/org/openjdk/bench/java/util/zip - additionally contributing > > > > > > I knew that "jmh-jdk-microbenchmarks" has been copied to the jdk repo > > but somehow I did found "jmh-jdk-microbenchmarks" before looking in > > the obvious place :) > > > > So I've added the benchmark to the patch now. There's one thing I'm > > not sure however. The benchmark requires a "big" (several 100k) with > > good compression ratio (e.g. a large text file). I've decided to use a > > big Java source file from "src.zip" but I'm not sure if "src.zip" is > > always available in the jdk images which are used to run the > > microbenchmarks. Do you think the test it is fine this way or do you > > have a better idea? I saw that "ZipFind" uses "microbenchmarks.jar" > > (i.e. the container of the test itself) but that file is already > > compressed so the compression rate won't be that good. > > > > Another thing I couldn't figure out is a good way to skip the > > benchmark when I realize that I can't load the expected file in the > > "@Setup" method. Do you now anything better than just throwing an > > exception? > > > > Thank you and best regards, > > Volker > > > > to jmh-jdk-microbenchmarks could enable you to test the micro on 8 or > > 11. > > > > /Claes > > > > > > > > Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037 > > Oracle Java Engineering > > 1 Network Drive > > Burlington, MA 01803 > > [email protected] > > > > > >
