On Fri, 21 Nov 2025 14:58:55 GMT, Liam Miller-Cushon <[email protected]> wrote:

> This implements an API to return the byte length of a String encoded in a 
> given charset. See [JDK-8372353](https://bugs.openjdk.org/browse/JDK-8372353) 
> for background.
> 
> ---
> 
> 
> Benchmark                              (encoding)  (stringLength)   Mode  Cnt           Score          Error  Units
> StringLoopJmhBenchmark.getBytes             ASCII              10  thrpt    5   406782650.595 ± 16960032.852  ops/s
> StringLoopJmhBenchmark.getBytes             ASCII             100  thrpt    5   172936926.189 ±  4532029.201  ops/s
> StringLoopJmhBenchmark.getBytes             ASCII            1000  thrpt    5    38830681.232 ±  2413274.766  ops/s
> StringLoopJmhBenchmark.getBytes             ASCII          100000  thrpt    5      458881.155 ±    12818.317  ops/s
> StringLoopJmhBenchmark.getBytes            LATIN1              10  thrpt    5    37193762.990 ±  3962947.391  ops/s
> StringLoopJmhBenchmark.getBytes            LATIN1             100  thrpt    5    55400876.236 ±  1267331.434  ops/s
> StringLoopJmhBenchmark.getBytes            LATIN1            1000  thrpt    5    11104514.001 ±    41718.545  ops/s
> StringLoopJmhBenchmark.getBytes            LATIN1          100000  thrpt    5      182535.414 ±    10296.120  ops/s
> StringLoopJmhBenchmark.getBytes             UTF16              10  thrpt    5   113474681.457 ±  8326589.199  ops/s
> StringLoopJmhBenchmark.getBytes             UTF16             100  thrpt    5    37854103.127 ±  4808526.773  ops/s
> StringLoopJmhBenchmark.getBytes             UTF16            1000  thrpt    5     4139833.009 ±    70636.784  ops/s
> StringLoopJmhBenchmark.getBytes             UTF16          100000  thrpt    5       57644.637 ±     1887.112  ops/s
> StringLoopJmhBenchmark.getBytesLength       ASCII              10  thrpt    5   946701647.247 ± 76938927.141  ops/s
> StringLoopJmhBenchmark.getBytesLength       ASCII             100  thrpt    5   396615374.479 ± 15167234.884  ops/s
> StringLoopJmhBenchmark.getBytesLength       ASCII            1000  thrpt    5   100464784.979 ±   794027.897  ops/s
> StringLoopJmhBenchmark.getBytesLength       ASCII          100000  thrpt    5     1215487.689 ±     1916.468  ops/s
> StringLoopJmhBenchmark.getBytesLength      LATIN1              10  thrpt    5   221265102.323 ± 17013983.056  ops/s
> StringLoopJmhBenchmark.getBytesLength      LATIN1             100  thrpt    5   137617873.887 ±  5842185.781  ops/s
> StringLoopJmhBenchmark.getBytesLength      LATIN1            1000  thrpt    5    92540259.130 ±  3839233.582  ops/s
> StringLoopJmhBenchmark.ge...

The test has an odd mix of throwing Exception and RuntimeException.
It would be good to upgrade the test to use JUnit (though it could/should be a 
separate PR).

src/java.base/share/classes/java/lang/String.java line 2112:

> 2110:      *
> 2111:      * <p>The result will be the same value as {@code getBytes(charset).length}.
> 2112:      *

An @implNote or @apiNote may be useful to indicate that computing the length
may allocate memory for some Charsets.

src/java.base/share/classes/java/lang/String.java line 2120:

> 2118:             return encodedLengthUTF8(coder, value);
> 2119:         }
> 2120:         if (bytesCompatible(cs, 0, value.length)) {

`bytesCompatible` gives a non-optimal answer for a US_ASCII input that has
chars > 0x7f.
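
To illustrate with public API only: `getBytes(US_ASCII)` replaces each unmappable
Latin-1 char with a single `?` byte, so for such input the encoded length still
equals the char count, and in principle it could be reported without encoding.
A minimal sketch; the class and method names are hypothetical, and it shows the
observable behaviour of `String.getBytes`, not the code in this PR:

```java
import java.nio.charset.StandardCharsets;

class AsciiLengthSketch {
    // Hypothetical helper: computes the encoded length the allocating way,
    // by materializing the byte[] and reading its length.
    static int asciiLength(String s) {
        return s.getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        String s = "caf\u00e9"; // U+00E9 is > 0x7f, so it is unmappable in US_ASCII
        // The unmappable char becomes one '?' byte, so both numbers are 4.
        System.out.println(asciiLength(s) + " " + s.length());
    }
}
```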

src/java.base/share/classes/java/lang/String.java line 2125:

> 2123:         if (cs instanceof sun.nio.cs.UTF_16LE ||
> 2124:             cs instanceof sun.nio.cs.UTF_16BE) {
> 2125:             return value.length << (1 - coder());

Please encapsulate this computation `byteFor(int length, coder) {...}` to make 
it easier to re-use and document.
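
Something along these lines, as a standalone sketch rather than a patch. The
class name, method name, and parameter names are placeholders, and the coder
constants are redeclared here on the assumption that they mirror the internal
`LATIN1 = 0` / `UTF16 = 1` values used by `String`:

```java
import java.nio.charset.StandardCharsets;

class Utf16LengthSketch {
    // Assumed to mirror the JDK-internal coder values; redeclared for the sketch.
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    /**
     * Number of bytes needed to encode a string as UTF-16LE/BE (no BOM),
     * given the length of its backing byte[] and its coder.
     * LATIN1 stores one byte per char, so the length doubles;
     * UTF16 already stores two bytes per char, so it is unchanged.
     */
    static int utf16ByteLength(int valueLength, byte coder) {
        return valueLength << (1 - coder);
    }

    public static void main(String[] args) {
        // "abc" is Latin-1 encodable: 3 backing bytes, 6 encoded bytes.
        System.out.println(utf16ByteLength(3, LATIN1)
                + " " + "abc".getBytes(StandardCharsets.UTF_16LE).length);
        // "a\u4e2d" needs the UTF16 coder: 4 backing bytes, 4 encoded bytes.
        System.out.println(utf16ByteLength(4, UTF16)
                + " " + "a\u4e2d".getBytes(StandardCharsets.UTF_16LE).length);
    }
}
```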

-------------

PR Review: https://git.openjdk.org/jdk/pull/28454#pullrequestreview-3658097768
PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2688260162
PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2688257004
PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2688253744
