Re: RFR: 8372353: API to compute the byte length of a String encoded in a given Charset [v5]

Volkan Yazici Thu, 15 Jan 2026 09:37:42 -0800

On Thu, 15 Jan 2026 17:02:01 GMT, Liam Miller-Cushon <[email protected]> wrote:


>> src/java.base/share/classes/java/lang/String.java line 1585:
>> 
>>> 1583: 
>>> 1584:     // This follows the implementation of encodeUTF8_UTF16
>>> 1585:     private static int encodedLengthUTF8_UTF16(byte[] val) {
>> 
>> Doesn't this duplicate the `computeSizeUTF8_UTF16`?
>> 
>> AFAICS, `computeSizeUTF8_UTF16` is missing the ASCII fast loop, but we can 
>> enhance it.
>> 
>> FWIW, if we decide to reuse `computeSizeUTF8_UTF16`, it might be better to 
>> rename it to `encodedLengthUTF8_UTF16`, which will be in line with the 
>> introduced `encodedLength*` method family.
>
> Thanks for the catch, good point. I will look at switching to 
> `computeSizeUTF8_UTF16`.
> 
> `computeSizeUTF8_UTF16` returns `long`, which raises the question of what to 
> do when the result exceeds `Integer.MAX_VALUE`. The return type of 
> `getBytesLength` could potentially be `long` and allow computing the encoded 
> length of strings that wouldn't fit into an array if they were encoded. Or it 
> could throw an exception in that case, similar to `getBytes`, and have an 
> `int` return type.

`computeSizeUTF8_UTF16` is only used in `encodeUTF8_UTF16`:


long allocLen = (sl * 3 < 0) ? computeSizeUTF8_UTF16(val, exClass) : sl * 3;
if (allocLen > (long)Integer.MAX_VALUE) {
    throw new OutOfMemoryError("Required length exceeds implementation limit");
}


I guess we can move the `if (allocLen > (long)Integer.MAX_VALUE)` check into 
`computeSizeUTF8_UTF16` and make its return type `int`.
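
A rough, standalone sketch of that shape, not the actual `String.java` code: 
the class `Utf8LengthSketch` and the methods `encodedLengthUTF8` and 
`checkedLength` are hypothetical names, and the sketch takes a `CharSequence` 
instead of the internal `byte[] val`. It accumulates the size in a `long` and 
does the limit check before narrowing to `int`. It also assumes unpaired 
surrogates are replaced by a single `?` byte, as `getBytes` does.

```java
// Hypothetical sketch; names and signatures are illustrative only and do not
// mirror the real JDK internals.
final class Utf8LengthSketch {

    // Computes the UTF-8 encoded length of UTF-16 input, returning an int.
    static int encodedLengthUTF8(CharSequence s) {
        long len = 0; // accumulate in a long so the sum itself cannot overflow
        int n = s.length();
        int i = 0;
        while (i < n) {
            char c = s.charAt(i++);
            if (c < 0x80) {
                len += 1; // ASCII
            } else if (c < 0x800) {
                len += 2; // two-byte sequence
            } else if (Character.isHighSurrogate(c)
                    && i < n
                    && Character.isLowSurrogate(s.charAt(i))) {
                len += 4; // valid surrogate pair encodes as four bytes
                i++;      // consume the low surrogate
            } else if (Character.isSurrogate(c)) {
                len += 1; // assumption: unpaired surrogate is replaced by '?'
            } else {
                len += 3; // remaining BMP characters
            }
        }
        return checkedLength(len);
    }

    // The limit check, moved out of the caller and into the size computation.
    static int checkedLength(long len) {
        if (len > (long) Integer.MAX_VALUE) {
            throw new OutOfMemoryError("Required length exceeds implementation limit");
        }
        return (int) len;
    }
}
```

With the check folded in, a caller such as `encodeUTF8_UTF16` could use the 
`int` result directly as an allocation size, and a length-only API would 
inherit the same failure mode as `getBytes`.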

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2695315669
