Hi Volker > On Apr 22, 2020, at 4:08 PM, Volker Simonis <volker.simo...@gmail.com> wrote: > > On Tue, Apr 21, 2020 at 5:23 PM Lance Andersen > <lance.ander...@oracle.com> wrote: >> >> Hi Volker, >> >> I think overall this looks OK. I went through the older SCCS histories to >> see if I could figure out why they were using 512 for the input length but >> could not find anything that might shed some light for me. >> > > Hi Lance, > > thanks a lot for digging in the old sources to review my change. It's > great that we stil have people who can use SCCS :)
:-) > >> I am not sure you can guarantee that src.zip exists but others might be able >> to weigh in here. What we have been trying to do going forward is to have >> the tests create the zip files that it needs. In some cases, we have >> created a byte array within the test which represents the zip and just write >> it out before the test begins. >> > > Yes, the dependency on an external file was not nice, so I rewrote the > benchmark to programmatically create a file which can be compressed by > a factor of ~6: > > http://cr.openjdk.java.net/~simonis/webrevs/2020/8242848.02/ > > Notice that this new version only changes the microbenchmark, all the > other files are untouched. > > As everybody seemed to be happy with the change itself and the > regression test, I'm now waiting for your and Clae's final review of > the microbenchmark before pushing. Please let me know hat you think? I think you are good to go from my POV. Thank you for your contribution to improving Zip performance. Best, Lance > > Best regards, > Volker > >> I am hoping others with more history might also chime in case they are aware >> of the history here. >> >> Thank you for helping improve the performance. >> >> Best >> Lance >> >> On Apr 17, 2020, at 6:49 AM, Volker Simonis <volker.simo...@gmail.com> wrote: >> >> Thanks everybody for looking at this change! >> >> Please find an updated webrev (with a new test and micro-benchmark) >> and my answers to your comments below: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2020/8242848.01/ >> >> On Thu, Apr 16, 2020 at 6:40 AM Vyom Tiwari <vyomm...@gmail.com> wrote: >> >> Thanks for doing this, i think the below code change is not required . >> >> Please do let me know if i am not reading it correctly. >> >> if (inf.finished() || (len == 0)/* no more input */) { >> >> If you check the native code "inf.finished() will be true only of the >> corresponding native call inflate(strm, Z_PARTIAL_FLUSH) returns >> "Z_STREAM_END". >> >> After your code change write may return even if finished() is not true. >> >> >> Yes, that's true, but that's what we must do if there's no more input >> available. Before my change this break on "len < 1" was in the "if >> (inf.needsInput())" branch. >> >> On Thu, Apr 16, 2020 at 8:22 AM Thomas Stüfe <thomas.stu...@gmail.com> wrote: >> >> 252 // Check the decompressor >> 253 if (inf.finished() || (len == 0)/* no more input */) { >> 254 break; >> 255 } >> >> Not sure but I think this is wrong because now you bypass the followup >> handling of inf.needsDirectory. >> >> Inflater.inflate() returns 0 if either needsInput or needsDirectory. We have >> to ignore needsInput since we have no input anymore, but needsDirectory has >> to be handled, no? >> >> >> You're absolutely right Thomas. Thanks for catching this! I've moved >> the check for "needsDictionary" in front of the "finished() || len == >> 0" check. >> >> Unfortunately there is not very good test coverage for zip with preset >> dictionaries (jtreg and submit repo passed without problems). So I >> added a new test for this use case to " >> test/jdk/java/util/zip/DeflateIn_InflateOut.java". >> >> On Thu, Apr 16, 2020 at 8:48 AM Thomas Stüfe <thomas.stu...@gmail.com> wrote: >> >> >> As for increasing the buffer size, it makes sense but IMHO needs a CSR and a >> release note. >> >> >> I don't think so. This is an internal, implementation specific setting >> which has never been exposed or documented before so I don't see why >> we should document it now. But let's discuss this separately in the >> corresponding JBS issue (see below). >> >> On Thu, Apr 16, 2020 at 1:18 PM Claes Redestad >> <claes.redes...@oracle.com> wrote: >> >> >> Hi Volker, >> >> On 2020-04-15 19:48, Volker Simonis wrote: >> >> While doing this change, I've also realized that all the streams in >> java.util.zip (i.e. DeflaterInputStream, GZIPInputStream, >> GZIPOutputStream, InflaterInputStream, DeflaterOutputStream) use an >> internal byte buffer of 512 bytes by default. Looking at the benchmark >> attached to JDK-8242864, I think that increasing this default to a >> bigger size (e.g. 4096 bytes) will considerably speed up (up to 50%) >> read and write operations on these streams when they are created with >> the default buffer size. I think the value "512" is a legacy of old >> times when memory was more precious:) so I've opened "JDK-8242864: >> Increase the default, internal buffer size of the Streams in >> java.util.zip" to track that as as separate issue. Do you think it >> makes sense to increase that default value? >> >> >> Seems reasonable. 8192 seems to be the buffer size we've been converging >> on elsewhere (see InputStream, BufferedInputStream, Files, ..). I also >> >> >> That seems reasonable. Alan commented on the JBS issue so we can >> continue the discussion there. >> >> found an instance of 8096, which is either a typo or a clever >> optimization to keep the array + array object header fit snugly within >> 8Kb. You chose. :-) >> >> >> I like how you try to be positive :) >> >> >> Thank you and best regards, >> Volker >> >> PS: do you think it makes sense to contribute the benchmark attached >> to JDK-8242864 to the code-tools/mh-jdk-microbenchmarks [1] project? >> >> [1]http://openjdk.java.net/projects/code-tools/jmh-jdk-microbenchmarks/ >> >> >> I'd definitely welcome the micro as part of the patch under >> test/micro/org/openjdk/bench/java/util/zip - additionally contributing >> >> >> I knew that "jmh-jdk-microbenchmarks" has been copied to the jdk repo >> but somehow I did found "jmh-jdk-microbenchmarks" before looking in >> the obvious place :) >> >> So I've added the benchmark to the patch now. There's one thing I'm >> not sure however. The benchmark requires a "big" (several 100k) with >> good compression ratio (e.g. a large text file). I've decided to use a >> big Java source file from "src.zip" but I'm not sure if "src.zip" is >> always available in the jdk images which are used to run the >> microbenchmarks. Do you think the test it is fine this way or do you >> have a better idea? I saw that "ZipFind" uses "microbenchmarks.jar" >> (i.e. the container of the test itself) but that file is already >> compressed so the compression rate won't be that good. >> >> Another thing I couldn't figure out is a good way to skip the >> benchmark when I realize that I can't load the expected file in the >> "@Setup" method. Do you now anything better than just throwing an >> exception? >> >> Thank you and best regards, >> Volker >> >> to jmh-jdk-microbenchmarks could enable you to test the micro on 8 or >> 11. >> >> /Claes >> >> >> >> Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037 >> Oracle Java Engineering >> 1 Network Drive >> Burlington, MA 01803 >> lance.ander...@oracle.com >> >> >> <http://oracle.com/us/design/oracle-email-sig-198324.gif> <http://oracle.com/us/design/oracle-email-sig-198324.gif> <http://oracle.com/us/design/oracle-email-sig-198324.gif> <http://oracle.com/us/design/oracle-email-sig-198324.gif>Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037 Oracle Java Engineering 1 Network Drive Burlington, MA 01803 lance.ander...@oracle.com <mailto:lance.ander...@oracle.com>