On Wed, 16 Feb 2022 09:30:46 GMT, Volker Simonis <simo...@openjdk.org> wrote:
> Currently, `InflaterInputStream::read()` first does a native call to the > underlying zlib `inflate()` function and only afterwards checks if the > inflater requires input (i.e. `Inflater::needsInput()`) or has finished > inflating (`Inflater::finished()`). This leads to an unnecessary native call > to `inflate()` when `InflaterInputStream::read()` is invoked for the very > first time because `Inflater::fill()` hasn't been called and another > unnecessary native call to detect the end of the stream. For small > streams/files which completely fit into the output buffer passed to > `InflaterInputStream::read()` we can therefore save two out of three native > calls. The attached micro benchmark shows that this results in a 5%-10% > performance improvement for zip files of sizes between 4096 to 256 bytes > (notice that in common jars like Tomcat, Spring-Boot, Maven, Jackson, etc. > about 60-80% of the classes are smaller than 4096 bytes). > > > before JDK-8281962 > ------------------ > Benchmark (size) Mode Cnt Score > Error Units > InflaterInputStreams.inflaterInputStreamRead 256 avgt 5 2.571 ± > 0.120 us/op > InflaterInputStreams.inflaterInputStreamRead 512 avgt 5 2.861 ± > 0.064 us/op > InflaterInputStreams.inflaterInputStreamRead 4096 avgt 5 5.110 ± > 0.278 us/op > > after JDK-8281962 > ----------------- > Benchmark (size) Mode Cnt Score > Error Units > InflaterInputStreams.inflaterInputStreamRead 256 avgt 5 2.332 ± > 0.081 us/op > InflaterInputStreams.inflaterInputStreamRead 512 avgt 5 2.691 ± > 0.293 us/op > InflaterInputStreams.inflaterInputStreamRead 4096 avgt 5 4.812 ± > 1.038 us/op > > > Tested with the JTreg zip/jar/zipfs and the JCK zip/jar tests. > > As a side effect, this change also fixes an issue with alternative zlib > implementations like zlib-Chromium or zlib-Cloudflare which pad the inflated > bytes with a specif byte pattern at the end of the output for debugging > purpose. This breaks code patterns like the following: > > > int readcount = 0; > while ((bytesRead = inflaterInputStream.read(data, 0, bufferSize)) != -1) { > outputStream.write(data, 0, bytesRead); > readCount++; > } > if (readCount == 1) { > return data; // <---- first bytes might be overwritten > } > return outputStream.toByteArray(); > > > Even if the whole data fits into the `data` array during the first call to > `inflaterInputStream.read()`, we still need a second call to > `inflaterInputStream.read()` in order to detect the end of the inflater > stream. If this second call calls into the native `inflate()` function of > Cloudflare/Chromium's zlib version, this will still write some padding bytes > at the beginning of the `data` array and overwrite the previously read data. > This issue has been reported in Spring [1] and ASM [2] when using these > libraries with Cloudflares zlib version (by setting `LD_LIBRARY_PATH` > accordingly). > > [1] https://github.com/spring-projects/spring-framework/issues/27429 > [2] https://gitlab.ow2.org/asm/asm/-/issues/317955 This change is probably okay but will require study to see if there are any side effects (sadly, this area has a history of side effects being reported months and years after a change). Would you mind holding off integrating this change until it has been reviewed by someone that works in the area? ------------- PR: https://git.openjdk.java.net/jdk/pull/7492