> Can I please get a review of this change which proposes to fix an issue 
> `java.util.zip.ZipFile` which would cause failures when multiple instances of 
> `ZipFile` using non-UTF8 `Charset` were operating against the same underlying 
> ZIP file? This addresses https://bugs.openjdk.org/browse/JDK-8347712.
> 
> ZIP file specification allows for ZIP entries to mark a `UTF-8` flag to 
> indicate that the entry name and comment are encoded using UTF8. A 
> `java.util.zip.ZipFile` can be constructed by passing it a `Charset`. This 
> `Charset` (which defaults to UTF-8) gets used for decoding entry names and 
> comments for non-UTF8 entries.
> 
> The internal implementation of `ZipFile` uses a `ZipCoder` (backed by 
> `java.nio.charset.CharsetEncoder/CharsetDecoder` instance) for the given 
> `Charset`. Except for UTF8 `ZipCoder`, other `ZipCoder`s are not thread safe.
> 
> The internal implementation of `ZipFile` maintains a cache of 
> `ZipFile$Source`. A `Source` corresponds to the underlying ZIP file and 
> during construction, uses a `ZipCoder` for parsing the ZIP entries and once 
> constructed holds on to the parsed ZIP structure. Multiple instances of a 
> `ZipFile` which all correspond to the same ZIP file on the filesystem, share 
> a single instance of `Source` (after the `Source` has been constructed and 
> cached). Although `ZipFile` instances aren't expected to be thread-safe, the 
> fact that multiple different instances of `ZipFile` could be sharing the same 
> instance of `Source` in concurrent threads, mandates that the `Source` must 
> be thread-safe.
> 
> In Java 15, we did a performance optimization through 
> https://bugs.openjdk.org/browse/JDK-8243469. As part of that change, we 
> started holding on to the `ZipCoder` instance (corresponding to the `Charset` 
> provided during `ZipFile` construction) in the `Source`. This stored 
> `ZipCoder` was then used for `ZipFile` operations when working with the ZIP 
> entries. As noted previously, any non-UTF8 `ZipCoder` is not thread-safe and 
> as a result, any usages of `ZipCoder` in the `Source` makes `Source` not 
> thread-safe too. That effectively violates the requirement that `Source` must 
> be thread-safe to allow for its usage in multiple different `ZipFile` 
> instances concurrently. This then causes `ZipFile` usages to fail in 
> unexpected ways like the one shown in the linked 
> https://bugs.openjdk.org/browse/JDK-8347712.
> 
> The commit in this PR addresses the issue by not maintaining `ZipCoder` as a 
> instance field of `Source`. Instead the `ZipCoder` is now maintained in the 
> `ZipFile`,...

Jaikiran Pai has updated the pull request with a new target base due to a merge 
or a rebase. The incremental webrev excludes the unrelated changes brought in 
by the merge/rebase. The pull request contains 21 additional commits since the 
last revision:

 - Eirik's review about code comments
 - fix comment typo
 - merge latest from master branch
 - 8355975: introduce a test for 8355975
 - merge latest from master branch
 - merge latest from master branch
 - merge latest from master branch
 - merge latest from master branch
 - merge latest from master branch
 - improve code comment for ZipFile.zipCoder
 - ... and 11 more: https://git.openjdk.org/jdk/compare/bfa4438c...9a29b960

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23986/files
  - new: https://git.openjdk.org/jdk/pull/23986/files/4e3f8e70..9a29b960

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23986&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23986&range=05-06

  Stats: 368006 lines in 4489 files changed: 111658 ins; 239302 del; 17046 mod
  Patch: https://git.openjdk.org/jdk/pull/23986.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23986/head:pull/23986

PR: https://git.openjdk.org/jdk/pull/23986

Reply via email to