On Wed, 11 Mar 2026 12:01:01 GMT, Shaojin Wen <[email protected]> wrote:

> The encodedLengthUTF8() method uses an int accumulator (dp) for the LATIN1 
> code path, while the UTF16 path (encodedLengthUTF8_UTF16) correctly uses a 
> long accumulator with an overflow check. When a LATIN1 string contains more 
> than Integer.MAX_VALUE/2 non-ASCII bytes, the int dp overflows, potentially 
> causing NegativeArraySizeException in downstream buffer allocation.
> 
> Fix: change dp from int to long and add the same overflow check used in the 
> UTF16 path.

src/java.base/share/classes/java/lang/String.java line 1519:

> 1517:             throw new OutOfMemoryError("Required length exceeds 
> implementation limit");
> 1518:         }
> 1519:         return (int) dp;

I think you could also keep `dp` as an `int` and throw when `dp < 0`.
But that variant only works because `dp` is incremented by at most 2 at each 
iteration, as it is here, so the first overflow always lands on a negative value.
Your variant with `long` is more robust.
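
To make the trade-off concrete, here is a minimal sketch of both variants.
It is not the code in String.java: the class name, the method names and the
`alreadyCounted` parameter are hypothetical, the latter added only so that the
overflow can be provoked without allocating a multi-gigabyte array.

```java
class EncodedLengthSketch {
    static final String MSG = "Required length exceeds implementation limit";

    // Variant A: int accumulator with a sign check after every step.
    // Sound only because each step adds at most 2, so the first wrap-around
    // past Integer.MAX_VALUE always produces a negative value.
    static int lengthIntChecked(byte[] latin1, int alreadyCounted) {
        int dp = alreadyCounted;
        for (byte b : latin1) {
            dp += (b < 0) ? 2 : 1; // non-ASCII Latin-1 needs 2 UTF-8 bytes
            if (dp < 0) {
                throw new OutOfMemoryError(MSG);
            }
        }
        return dp;
    }

    // Variant B: long accumulator with a single range check at the end.
    // Does not depend on how large each individual step is.
    static int lengthLong(byte[] latin1, int alreadyCounted) {
        long dp = alreadyCounted;
        for (byte b : latin1) {
            dp += (b < 0) ? 2 : 1;
        }
        if (dp > Integer.MAX_VALUE) {
            throw new OutOfMemoryError(MSG);
        }
        return (int) dp;
    }

    public static void main(String[] args) {
        byte[] twoNonAscii = { (byte) 0xFF, (byte) 0xFF };
        System.out.println(lengthIntChecked(twoNonAscii, 0));
        System.out.println(lengthLong(twoNonAscii, 0));
        try {
            // Start just below the limit (hypothetical starting point).
            lengthIntChecked(twoNonAscii, Integer.MAX_VALUE - 1);
        } catch (OutOfMemoryError e) {
            System.out.println("int variant threw: " + e.getMessage());
        }
        try {
            lengthLong(twoNonAscii, Integer.MAX_VALUE - 1);
        } catch (OutOfMemoryError e) {
            System.out.println("long variant threw: " + e.getMessage());
        }
    }
}
```

If a later change ever let a single iteration add a large amount, variant A
could wrap all the way back to a positive value and miss the overflow, while
variant B would still catch it.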

test/jdk/java/lang/String/EncodedLengthUTF8Overflow.java line 111:

> 109:         }
> 110:         bigArray = null; // allow GC
> 111: 

Have you considered simplifying the above code with just `bigString = 
String.valueOf('\u00ff').repeat(length)`?
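
A small-scale sketch of that idea, assuming Java 11 or later for
`String.repeat`. The class and method names are hypothetical, and the length
is kept tiny here; the real test would pass a length large enough to
overflow an `int`.

```java
import java.nio.charset.StandardCharsets;

class RepeatSketch {
    // Builds a string of `length` copies of U+00FF, a non-ASCII Latin-1
    // character that takes two bytes in UTF-8.
    static String nonAsciiLatin1(int length) {
        return String.valueOf('\u00ff').repeat(length);
    }

    public static void main(String[] args) {
        String s = nonAsciiLatin1(4);
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(s.length() + " chars, " + utf8.length + " UTF-8 bytes");
    }
}
```

This avoids filling and then discarding an intermediate array by hand,
although `repeat` still has to allocate the backing storage of the result.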

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/30189#discussion_r2948846882
PR Review Comment: https://git.openjdk.org/jdk/pull/30189#discussion_r2948920395
