Add an API note to `InflaterInputStream::read(byte[] b, int off, int len)` to 
highlight that it might  write more bytes than the returned number of inflated 
bytes into the buffer `b`.

The superclass `java.io.InputStream` specifies that `read(byte[] b, int off, 
int len)` will leave the content beyond the last read byte in the read buffer 
`b` unaffected. However, the overridden `read` method in `InflaterInputStream` 
passes the read buffer `b` to `Inflater::inflate(byte[] b, int off, int len)` 
which doesn't provide this guarantee. Depending on implementation details, 
`Inflater::inflate` might write more than the returned number of inflated bytes 
into the buffer `b`.

### TL;DR

`java.util.zip.Inflater` is the Java wrapper class for zlib's inflater 
functionality. `Inflater::inflate(byte[] output, int off, int len)` currently 
calls zlib's native `inflate(..)` function and passes the address of 
`output[off]` and `len` to it via JNI.

The specification of zlib's `inflate(..)` function (i.e. the [API documentation 
in the original zlib 
implementation](https://github.com/madler/zlib/blob/cacf7f1d4e3d44d871b605da3b647f07d718623f/zlib.h#L400))
 doesn't give any guarantees with regard to usage of the output buffer. It only 
states that upon completion the function will return the number of bytes that 
have been written (i.e. "inflated") into the output buffer.

The original zlib implementation only wrote as many bytes into the output 
buffer as it inflated. However, this is not a hard requirement and newer, more 
performant implementations of the zlib library like 
[zlib-chromium](https://chromium.googlesource.com/chromium/src/third_party/zlib/)
 or [zlib-cloudflare](https://github.com/cloudflare/zlib) can use more bytes of 
the output buffer than they actually inflate as a scratch buffer. See 
https://github.com/simonis/zlib-chromium for a more detailed description of 
their approach and its performance benefit.

These new zlib versions can still be used transparently from Java (e.g. by 
putting them into the `LD_LIBRARY_PATH` or by using `LD_PRELOAD`), because they 
still fully comply to specification of `Inflater::inflate(..)`. However, we 
might run into problems when using the `Inflater` functionality from the 
`InflaterInputStream` class. `InflaterInputStream` is derived from from 
`InputStream` and as such, its `read(byte[] b, int off, int len)` method is 
quite constrained. It specifically specifies that if *k* bytes have been read, 
then "these bytes will be stored in elements `b[off]` through `b[off+`*k*`-1]`, 
leaving elements `b[off+`*k*`]` through `b[off+len-1]` **unaffected**". But 
`InflaterInputStream::read(byte[] b, int off, int len)` (which is constrained 
by `InputStream::read(..)`'s specification) calls `Inflater::inflate(byte[] b, 
int off, int len)` and directly passes its output buffer down to the native 
zlib `inflate(..)` method which is free to change the bytes beyond `b[off+`*k*
 `]` (where *k* is the number of inflated bytes).

>From a practical point of view, I don't see this as a big problem, because 
>callers of `InflaterInputStream::read(byte[] b, int off, int len)` can never 
>know how many bytes will be written into the output buffer `b` (and in fact 
>its content can always be completely overwritten). It therefore makes no sense 
>to depend on any data there being untouched after the call. Also, having used 
>zlib-cloudflare productively for about two years, we haven't seen real-world 
>issues because of this behavior yet. However, from a specification point of 
>view it is easy to artificially construct a program which violates 
>`InflaterInputStream::read(..)`'s postcondition if using one of the 
>alterantive zlib implementations. A recently integrated JTreg test 
>(test/jdk/jdk/nio/zipfs/ZipFSOutputStreamTest.java) "unintentionally" fails 
>with zlib-chromium but can fixed easily to run with alternative 
>implementations as well (see 
>[JDK-8283756](https://bugs.openjdk.java.net/browse/JDK-8283756)).

-------------

Commit messages:
 - 8283758: Problems due to conflicting specification of Inflater::inflate(..) 
and InflaterInputStream::read(..)

Changes: https://git.openjdk.java.net/jdk/pull/7986/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7986&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8283758
  Stats: 6 lines in 1 file changed: 5 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7986.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7986/head:pull/7986

PR: https://git.openjdk.java.net/jdk/pull/7986

Reply via email to