Strings, after construction, are immutable but may be constructed from mutable 
arrays of bytes, characters, or integers.
The string constructors should guard against arrays being mutated during
construction, since such mutation can invalidate the internal invariants that
operations on the resulting strings rely on. In particular, a number of
operations are optimized for pairs of latin1 strings and for pairs of
non-latin1 strings, while operations between a latin1 and a non-latin1 string
use a more general implementation.
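
As a small illustration of the first point, using only the public API: once a
constructor has returned, later writes to the source array do not reach the
string. The class and method names below (StringCopyDemo, buildThenOverwrite)
are made up for this example, and it covers only the sequential case; the
hazard this change addresses is a write that lands while the constructor is
still running.

```java
import java.util.Arrays;

// Illustrative only: StringCopyDemo and buildThenOverwrite are hypothetical names.
final class StringCopyDemo {

    /** Builds a String from the array, then overwrites every array element. */
    static String buildThenOverwrite(char[] source, char filler) {
        String s = new String(source);  // the constructor copies the array contents
        Arrays.fill(source, filler);    // mutation after construction has completed
        return s;
    }

    public static void main(String[] args) {
        char[] data = {'j', 'a', 'v', 'a'};
        String s = buildThenOverwrite(data, '?');
        System.out.println(s);                 // prints "java"
        System.out.println(new String(data));  // prints "????"
    }
}
```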

The changes include:

- Add a warning to each constructor that takes an array argument, indicating
  that the result is indeterminate if the input array is modified before the
  constructor returns.
  The resulting string may contain any combination of characters sampled from
  the input array.

- Ensure that strings that are represented as non-latin1 contain at least one
  non-latin1 character.
  For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF-8, or
  another encoding decoded to latin1, the scanning and compression are
  unchanged.
  If a non-latin1 character is found, the string is represented as non-latin1,
  with the added verification that a non-latin1 character is present at the
  same index.
  If that character is found to be latin1, then the input array has been
  modified and the result of the scan may be incorrect.
  A ConcurrentModificationException could be thrown at this point, but an
  unexpected exception is a risk to existing applications and should be
  avoided.
  Instead, the non-latin1 copy of the input is re-scanned and compressed; that
  scan determines whether the latin1 or the non-latin1 representation is
  returned.

- The methods that scan for non-latin1 characters and their intrinsic 
implementations are updated to return the index of the non-latin1 character.

- String construction from StringBuilder and CharSequence must also be guarded 
as their contents may be modified during construction.
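
The scan, verify, and re-scan steps in the list above can be sketched roughly
as follows. This is a simplified model under stated assumptions, not the JDK
code: CompactStringSketch, Encoded, compress, and encode are hypothetical
names, and the betweenScanAndCopy hook exists only in this sketch to stand in
for a concurrent writer so that the race can be reproduced deterministically.

```java
import java.util.Arrays;

// Simplified, hypothetical model of the construction strategy; not the JDK code.
final class CompactStringSketch {

    /** Holder for the chosen representation; exactly one array is non-null. */
    static final class Encoded {
        final byte[] latin1;  // one byte per char, or null
        final char[] utf16;   // private copy of the input, or null

        Encoded(byte[] latin1, char[] utf16) {
            this.latin1 = latin1;
            this.utf16 = utf16;
        }

        boolean isLatin1() {
            return latin1 != null;
        }
    }

    /**
     * Copies chars into dst while they fit in one byte.
     * Returns the index of the first non-latin1 char, or len if there is none.
     */
    static int compress(char[] src, byte[] dst, int len) {
        for (int i = 0; i < len; i++) {
            char c = src[i];
            if (c > 0xFF) {
                return i;
            }
            dst[i] = (byte) c;
        }
        return len;
    }

    /**
     * Encodes src. The betweenScanAndCopy hook is an assumption of this sketch:
     * it runs after the first scan and simulates a concurrent writer.
     */
    static Encoded encode(char[] src, Runnable betweenScanAndCopy) {
        int len = src.length;
        byte[] latin1 = new byte[len];
        int index = compress(src, latin1, len);
        if (index == len) {
            return new Encoded(latin1, null);   // every char fit in one byte
        }
        betweenScanAndCopy.run();
        char[] copy = Arrays.copyOf(src, len);  // later writes to src cannot reach this copy
        if (copy[index] > 0xFF) {
            return new Encoded(null, copy);     // a non-latin1 char is present
        }
        // The input changed after the scan. Re-scan the private copy and let
        // that scan decide which representation is returned.
        index = compress(copy, latin1, len);
        return index == len ? new Encoded(latin1, null) : new Encoded(null, copy);
    }

    public static void main(String[] args) {
        char[] stable = {'a', '\u0100', 'c'};
        System.out.println(encode(stable, () -> { }).isLatin1());          // prints "false"

        char[] racy = {'a', '\u0100', 'c'};
        System.out.println(encode(racy, () -> racy[1] = 'b').isLatin1());  // prints "true"
    }
}
```

In this model no exception is thrown when the modification is detected; the
second scan runs over the private copy, so its answer cannot be invalidated by
further writes to the caller's array.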

-------------

Commit messages:
 - Cleanup javadoc, whitespace, and formatting in the JMH benchmark
 - Update RiscV implementation of intrinsic for java.lang.StringUTF16.compress
 - Javadoc formatting
 - 8311906: Improve robustness of String constructors with mutable array 
arguments

Changes: https://git.openjdk.org/jdk/pull/16425/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16425&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8311906
  Stats: 1057 lines in 11 files changed: 859 ins; 82 del; 116 mod
  Patch: https://git.openjdk.org/jdk/pull/16425.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16425/head:pull/16425

PR: https://git.openjdk.org/jdk/pull/16425
