On Wed, 11 Mar 2026 12:01:01 GMT, Shaojin Wen <[email protected]> wrote:
> The encodedLengthUTF8() method uses an int accumulator (dp) for the LATIN1
> code path, while the UTF16 path (encodedLengthUTF8_UTF16) correctly uses a
> long accumulator with an overflow check. When a LATIN1 string contains more
> than Integer.MAX_VALUE/2 non-ASCII bytes, the int dp overflows, potentially
> causing NegativeArraySizeException in downstream buffer allocation.
>
> Fix: change dp from int to long and add the same overflow check used in the
> UTF16 path.
src/java.base/share/classes/java/lang/String.java line 1519:
> 1517: throw new OutOfMemoryError("Required length exceeds implementation limit");
> 1518: }
> 1519: return (int) dp;
I think you could leave the code as it currently is, with an `int`
accumulator, and throw when `dp < 0`. That variant only works when `dp` is
incremented by at most 2 at each iteration, as is the case here, so your
variant with `long` is more robust.
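
To make the bound on the increment concrete, here is a minimal sketch. It
assumes the `dp < 0` check runs once after the loop, and the class and method
names are hypothetical, not taken from String.java. Rather than looping over
a huge array, it computes the exact sum as a `long` and narrows it to `int`,
which leaves the same bits an `int` accumulator would hold.

```java
// Hypothetical sketch; none of these names come from String.java.
class OverflowCheckSketch {

    // Value an int accumulator would hold after `length` iterations that each
    // add `maxStep`. Narrowing the exact long sum to int keeps the low 32
    // bits, which is the same result as wrapping int additions.
    static int wrappedTotal(int length, int maxStep) {
        long exact = (long) length * maxStep;
        return (int) exact;
    }

    // The long-based check: the exact sum is compared against the int range
    // before narrowing, so the step size does not matter.
    static int checkedLength(long dp) {
        if (dp > Integer.MAX_VALUE) {
            throw new OutOfMemoryError("Required length exceeds implementation limit");
        }
        return (int) dp;
    }

    public static void main(String[] args) {
        // Step 2: the exact sum is 2^32 - 2, which wraps to -2, so a final
        // `dp < 0` check fires.
        System.out.println(wrappedTotal(Integer.MAX_VALUE, 2)); // -2
        // Step 3: the exact sum is 3 * (2^31 - 1), which wraps to a positive
        // value, so a final `dp < 0` check would miss the overflow.
        System.out.println(wrappedTotal(Integer.MAX_VALUE, 3)); // 2147483645
    }
}
```

With a step of at most 2 the exact sum never exceeds 2^32 - 2, so every
overflowed total lands in the negative `int` range. With a step of 3 the sum
can wrap all the way back to a non-negative value.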
test/jdk/java/lang/String/EncodedLengthUTF8Overflow.java line 111:
> 109: }
> 110: bigArray = null; // allow GC
> 111:
Have you considered simplifying the above code to just `bigString =
String.valueOf('\u00ff').repeat(length)`?
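
A small-scale sketch of that suggestion, assuming the character literal
`'\u00ff'` is what is meant. The class name, helper name and the length of
1000 are illustrative only; the real test would need a length above
Integer.MAX_VALUE/2 and a correspondingly large heap.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; the names and the small length are for illustration.
class RepeatSketch {

    // Builds a string of `length` copies of U+00FF, a non-ASCII character
    // that still fits in Latin-1.
    static String nonAsciiLatin1(int length) {
        return String.valueOf('\u00ff').repeat(length);
    }

    public static void main(String[] args) {
        String s = nonAsciiLatin1(1_000);
        // Each U+00FF takes two bytes in UTF-8, so the encoded length is
        // twice the string length.
        System.out.println(s.length());                                // 1000
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 2000
    }
}
```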
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/30189#discussion_r2948846882
PR Review Comment: https://git.openjdk.org/jdk/pull/30189#discussion_r2948920395