On Thu, 21 Jul 2022 17:45:39 GMT, Mark Reinhold <m...@openjdk.org> wrote:

>> Volker Simonis has refreshed the contents of this pull request, and previous 
>> commits have been removed. Incremental views are not available.
>
> First, I’ll observe that there is no specification conflict here.  The 
> specifications of `InflaterInputStream::read(byte[] b, int off, int len)` and 
> `Inflater::inflate(byte[] b, int off, int len)` are not in conflict at all.  
> The latter is simply weaker than the former.
> 
> What is true is that if you build the JDK in the default manner, without any 
> patches, then what you wind up with is an `InflaterInputStream::read(...)` 
> implementation that satisfies the inherited `InputStream::read(...)` 
> specification, including the constraint that a read operation must not 
> _scribble_ in the caller’s output buffer, _i.e._, modify any byte in that 
> buffer beyond the returned byte count.  (That constraint is, so far as we 
> know, satisfied by every `InputStream` subclass defined in the JDK.)
> 
> A default JDK build’s `InflaterInputStream::read(...)` method doesn’t 
> scribble because
> 
>   - The implementation of `InflaterInputStream::read(...)` assumes that the 
> implementation of `Inflater::inflate(...)` doesn’t scribble, and
> 
>   - The implementation of `Inflater::inflate(...)` satisfies that assumption 
> because it invokes the corresponding function in the [standard native Zip 
> implementation](https://zlib.net/), which itself doesn’t scribble.
> 
> So, not only does the specification forbid `InflaterInputStream::read(...)` 
> from scribbling, but in the official Java SE Reference Implementation — which 
> is simply a default JDK build — that method does not scribble.  (Perhaps by 
> accident, but that is immaterial here.)
> 
> Therefore, if in your own JDK build you substitute an alternative Zip 
> implementation whose `inflate` function does scribble then you wind up with a 
> JDK that both violates the specification and behaves differently than the RI. 
>  It is incompatible.  If you use such a JDK in production then you risk 
> breaking existing code.
> 
> Now, compatibility is one of the key values of the Java Platform, but 
> performance is not unimportant.  Is there something we can do to accommodate 
> [alternative, faster, scribbling Zip implementations][faster-zlib]?
> 
> @simonis lists three options in the [issue].  I’ll cast the options somewhat 
> differently.
> 
>   1. Do not change the specification in any way.  The specification text that 
> forbids scribbling dates back to Java 1.1, released over 25 years ago.  
> (Arguably that text over-specifies the `InputStream::read(...)` method, but 
> that is also immaterial.)  It’s all too possible that existing code — no 
> matter how odd or poorly written we think it might be — depends upon this 
> constraint.
> 
>      With this option, a JDK build that uses a scribbling Zip implementation 
> would have to buffer that implementation’s output and copy only the returned 
> number of bytes into the caller’s buffer.  This would degrade any performance 
> benefit.
> 
>   2. Weaken the specification of `InflaterInputStream` — and, thereby, that 
> of its subclasses `GZipInputStream` and `ZipInputStream` — to allow 
> scribbling.  This is essentially @simonis’s initial suggestion, as embodied 
> in [63ae357] earlier in this PR.
> 
>      This option risks breaking existing code that depends upon the 
> 25-year-old promise that `InflaterInputStream::read(...)` will not scribble.  
> Changing the specification of an indirect subclass of `InputStream`, 
> moreover, is apt to go unnoticed by many developers, who routinely deal with 
> instances of `InputStream` without any knowledge of those instances’ concrete 
> classes.  (@jaikiran gives one example above.)
> 
>   3. Weaken the specification of the `InputStream::read(...)` method to allow 
> scribbling, as [suggested by @jddarcy in the CSR][jddarcy].
> 
>      This option is at least as risky to existing code as the second option, 
> though it’s marginally more likely that developers writing new code will 
> notice the specification change.
> 
>      What’s worse, however, is that this option gives developers writing new 
> subclasses of `InputStream` permission to deviate from both that class’s 
> longstanding specification and the actual behavior of every `InputStream` 
> subclass defined in the JDK.  In the long term, that could wreak subtle bugs 
> throughout the ecosystem, even breaking code that does absolutely nothing 
> with Zip-related input streams.
> 
> In favor of both the second and third options is @simonis’s statement above 
> that his employer (Amazon) has used JDK builds with a scribbling Zip 
> implementation in production for two years.  He has further elaborated 
> (privately) that this includes hundreds of services built with many of the 
> usual popular frameworks and libraries (Spring, Hibernate, ASM, etc.).  This 
> suggests that the practical impact, out in the wild, of allowing scribbling 
> Zip implementations is acceptable.
> 
> On that basis I am — just barely — comfortable with the second option, 
> _i.e._, weakening the specification of `InflaterInputStream` to allow 
> scribbling, thereby contradicting the specification of its `InputStream` 
> superclass.
> 
> Explicitly specifying a subclass in a way that’s inconsistent with the 
> specification of one of its superclasses is generally a really bad idea.  It 
> violates the [Liskov Substitution Principle][lsp], which enables instances of 
> a subclass to be substituted freely for instances of one of its superclasses. 
>  Thus we do not condone such changes lightly.  There are, however, precedents 
> in a handful of special cases in the Java Platform; _e.g._, in the 
> [`java.util.IdentityHashMap`][ihm] and [`java.sql.Timestamp`][ts] classes.
> 
> The third option above would not violate the Liskov substitution principle, 
> but in my view could prove more harmful in the long run.
> 
> Recommendations:
> 
>   - Revert this PR back to [63ae357], and make corresponding updates to the 
> [issue] and the [CSR].
> 
>   - Additionally, find every method in every `java.*` class that’s declared 
> to return an `InputStream` (_e.g._, `ZipFile::getInputStream(...)` but 
> actually returns an instance of `InflaterInputStream` or one of its 
> subclasses.  Add an API note to each such method to let developers know that 
> the returned input stream might scribble, per @jaikiran’s suggestion above.  
> These notes can all use the same text, linking to the revised 
> `InflaterInputStream::read(...)` specification.
> 
>   - Revise the summary of this PR, the issue, and the CSR to reflect the true 
> nature of the proposed change rather than assert a nonexistent conflict.  
> Consider “Weaken the `InflaterInputStream` specification in order to allow 
> faster Zip implementations”.
> 
>   - Change the issue type from a bug — which it’s not — to an enhancement.
> 
>   - Create a related release-note issue, as @RogerRiggs suggests above.
> 
> 
> [faster-zlib]: 
> https://github.com/simonis/zlib-chromium#inflater-improvements-in-zlib-chromium
> [issue]: https://bugs.openjdk.org/browse/JDK-8282648
> [63ae357]: 
> https://github.com/openjdk/jdk/pull/7986/commits/63ae3572da674a0926bb9d0adcea88d89bb56707
> [jddarcy]: 
> https://bugs.openjdk.org/browse/JDK-8283758?focusedCommentId=14492930&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14492930
> [CSR]: https://bugs.openjdk.org/browse/JDK-8283758
> [lsp]: https://en.wikipedia.org/wiki/Liskov_substitution_principle
> [ihm]: 
> https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/IdentityHashMap.html
> [ts]: 
> https://docs.oracle.com/en/java/javase/18/docs/api/java.sql/java/sql/Timestamp.html

@mbreinhold, @LanceAndersen thanks for your reviews. I'm waiting for the CSR 
approval before pushing.

-------------

PR: https://git.openjdk.org/jdk/pull/7986

Reply via email to