On Thu, 15 Jan 2026 17:02:01 GMT, Liam Miller-Cushon <[email protected]> wrote:
>> src/java.base/share/classes/java/lang/String.java line 1585:
>>
>>> 1583:
>>> 1584: // This follows the implementation of encodeUTF8_UTF16
>>> 1585: private static int encodedLengthUTF8_UTF16(byte[] val) {
>>
>> Doesn't this duplicate `computeSizeUTF8_UTF16`?
>>
>> AFAICS, `computeSizeUTF8_UTF16` is missing the ASCII fast loop, but we can
>> enhance it.
>>
>> FWIW, if we decide to reuse `computeSizeUTF8_UTF16`, it might be better to
>> rename it to `encodedLengthUTF8_UTF16`, which will be in line with the
>> introduced `encodedLength*` method family.
>
> Thanks for the catch, good point. I will look at switching to
> `computeSizeUTF8_UTF16`.
>
> `computeSizeUTF8_UTF16` returns `long`, which raises the question of what
> `getBytesLength` should do when the encoded length does not fit in an `int`.
> Its return type could be `long`, which would allow computing the encoded
> length of strings that wouldn't fit into an array if they were encoded. Or it
> could throw an exception in that case, similar to `getBytes`, and have an
> `int` return type.

`computeSizeUTF8_UTF16` is only used in `encodeUTF8_UTF16`:

    long allocLen = (sl * 3 < 0) ? computeSizeUTF8_UTF16(val, exClass) : sl * 3;
    if (allocLen > (long)Integer.MAX_VALUE) {
        throw new OutOfMemoryError("Required length exceeds implementation limit");
    }

I guess we can move the `if (allocLen > (long)Integer.MAX_VALUE)` check into
`computeSizeUTF8_UTF16` and change its return type to `int`.

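
To make the idea concrete, here is a rough standalone sketch, not the actual
patch. `EncodedLengthSketch`, `encodedLengthUTF8` and `checkedLength` are
hypothetical names. It walks a `CharSequence` rather than the UTF-16 `byte[]`
that `String` uses internally, and it assumes unpaired surrogates are replaced
by `?`, as `String.getBytes(UTF_8)` does. The point is only that the length is
accumulated in a `long` and the limit check lives next to the computation, so
callers get an `int`:

```java
import java.nio.charset.StandardCharsets;

final class EncodedLengthSketch {

    // Hypothetical stand-in for computeSizeUTF8_UTF16 with an int return type.
    static int encodedLengthUTF8(CharSequence s) {
        long len = 0; // accumulate in long so the sum itself cannot overflow
        int n = s.length();
        int i = 0;
        while (i < n) {
            char c = s.charAt(i++);
            if (c < 0x80) {
                len += 1; // ASCII
            } else if (c < 0x800) {
                len += 2;
            } else if (Character.isSurrogate(c)) {
                if (Character.isHighSurrogate(c) && i < n
                        && Character.isLowSurrogate(s.charAt(i))) {
                    len += 4; // valid pair encodes one supplementary code point
                    i++;
                } else {
                    len += 1; // assumption: unpaired surrogate becomes '?'
                }
            } else {
                len += 3; // remaining BMP characters
            }
        }
        return checkedLength(len);
    }

    // The limit check that encodeUTF8_UTF16 currently performs on allocLen,
    // moved next to the size computation.
    static int checkedLength(long len) {
        if (len > (long) Integer.MAX_VALUE) {
            throw new OutOfMemoryError(
                "Required length exceeds implementation limit");
        }
        return (int) len;
    }

    public static void main(String[] args) {
        String s = "a\u00e9\u20ac\uD83D\uDE00";
        // Both values should agree for any input String.
        System.out.println(encodedLengthUTF8(s));
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

With something like this, `encodeUTF8_UTF16` would no longer need its own
`allocLen > Integer.MAX_VALUE` check on the precomputed path, and the new
length method could reuse the same code and throw in the same way `getBytes`
does.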
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2695315669