On Thu, 15 Jan 2026 16:17:22 GMT, Volkan Yazici <[email protected]> wrote:
>> Liam Miller-Cushon has updated the pull request incrementally with one
>> additional commit since the last revision:
>>
>> Update tests
>
> src/java.base/share/classes/java/lang/String.java line 2143:
>
>> 2141: */
>> 2142: public int getBytesLength(Charset cs) {
>> 2143: if (cs == UTF_8.INSTANCE) {
>
> It'd be nice to catch null values as early as possible. I suggest adding a
> `Objects.requireNonNull(cs)` along with `@throws NullPointerException If
> {@code cs} is null` in docs.
I added the `requireNonNull`, omitting the `@throws` as suggested in
https://github.com/openjdk/jdk/pull/28454/changes#r2695394410
> src/java.base/share/classes/java/lang/String.java line 2148:
>
>> 2146: if (isLatin1()) {
>> 2147: return value.length;
>> 2148: }
>
> Any particular reason you avoided introducing a `encodedLength8859_1` here?
> (There is a `encode8859_1` method.)
I have tentatively added `encodedLength8859_1`
`encode8859_1` is implemented in terms of the `implEncodeISOArray`, so it is
less similar than the other examples. In general I figured there was a tradeoff
between the performance benefit and the additional code to have fast paths for
each charset, and UTF-8 may be more frequently used.
> src/java.base/share/classes/java/lang/String.java line 2151:
>
>> 2149: } else if (cs == US_ASCII.INSTANCE) {
>> 2150: return encodedLengthASCII(coder, value);
>> 2151: } else if (cs instanceof sun.nio.cs.UTF_16LE || cs instanceof
>> sun.nio.cs.UTF_16BE) {
>
> I see that `sun.nio.cs.UTF_16{LE,BE}` specialization is suggested by
> @ExE-Boss [here]. Though I'm not really sure if this is really needed. I
> cannot spot any other usage of these constants in `java.base`, except
> `jdk.internal.foreign.StringSupport`, which is irrelevant.
>
> [here]: https://github.com/openjdk/jdk/pull/28454/files#r2552768341
I don't have a strong opinion about these charsets. It's nice that the encoded
length for them can be calculated in constant time, but on the other hand if
they are less frequently used and there isn't precedent for special casing them
in `java.base`, then this part could be dropped.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2695394410
PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2695468134
PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2695471999