On Fri, 21 Nov 2025 10:47:42 GMT, Liam Miller-Cushon <[email protected]> wrote:

>> This PR proposes adding a new overload to `MemorySegment::getString` that 
>> takes a known byte length of the content.
>> 
>> This was previously proposed in https://github.com/openjdk/jdk/pull/20725, 
>> but the outcome of 
>> [JDK-8333843](https://bugs.openjdk.org/browse/JDK-8333843) was to update 
>> `MemorySegment#getString` to suggest
>> 
>> 
>>     byte[] bytes = new byte[length];
>>     MemorySegment.copy(segment, JAVA_BYTE, offset, bytes, 0, length);
>>     return new String(bytes, charset);
>> 
>> 
>> However this is less efficient than what the implementation of getString 
>> does after [JDK-8362893](https://bugs.openjdk.org/browse/JDK-8362893), it 
>> now uses `JavaLangAccess::uncheckedNewStringNoRepl` to avoid the copy.
>> 
>> See also discussion in [this panama-dev@ 
>> thread](https://mail.openjdk.org/pipermail/panama-dev/2025-November/021193.html),
>>  and mcimadamore's document [Pulling the (foreign) 
>> string](https://cr.openjdk.org/~mcimadamore/panama/strings_ffm.html)
>> 
>> Benchmark results:
>> 
>> 
>> Benchmark                                 (size)  Mode  Cnt    Score   Error 
>>  Units
>> ToJavaStringTest.jni_readString                5  avgt   30   55.339 ± 0.401 
>>  ns/op
>> ToJavaStringTest.jni_readString               20  avgt   30   59.887 ± 0.295 
>>  ns/op
>> ToJavaStringTest.jni_readString              100  avgt   30   84.288 ± 0.419 
>>  ns/op
>> ToJavaStringTest.jni_readString              200  avgt   30  119.275 ± 0.496 
>>  ns/op
>> ToJavaStringTest.jni_readString              451  avgt   30  193.106 ± 1.528 
>>  ns/op
>> ToJavaStringTest.panama_copyLength             5  avgt   30    7.348 ± 0.048 
>>  ns/op
>> ToJavaStringTest.panama_copyLength            20  avgt   30    7.440 ± 0.125 
>>  ns/op
>> ToJavaStringTest.panama_copyLength           100  avgt   30   11.766 ± 0.058 
>>  ns/op
>> ToJavaStringTest.panama_copyLength           200  avgt   30   16.096 ± 0.089 
>>  ns/op
>> ToJavaStringTest.panama_copyLength           451  avgt   30   25.844 ± 0.054 
>>  ns/op
>> ToJavaStringTest.panama_readString             5  avgt   30    5.857 ± 0.046 
>>  ns/op
>> ToJavaStringTest.panama_readString            20  avgt   30    7.750 ± 0.046 
>>  ns/op
>> ToJavaStringTest.panama_readString           100  avgt   30   14.109 ± 0.187 
>>  ns/op
>> ToJavaStringTest.panama_readString           200  avgt   30   18.035 ± 0.130 
>>  ns/op
>> ToJavaStringTest.panama_readString           451  avgt   30   35.896 ± 0.227 
>>  ns/op
>> ToJavaStringTest.panama_readStringLength       5  avgt   30    4.565 ± 0.038 
>>  ns/op
>> ToJavaStringTest.panama_readStringLength      20...
>
> Liam Miller-Cushon has updated the pull request incrementally with two 
> additional commits since the last revision:
> 
>  - Improve test coverage, and more fixes
>  - Review feedback
>    
>    * document assertion to link to bytesCompatible
>    * throw IAE for length > Integer.MAX_VALUE
>    * javadoc fixes

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 2660:

> 2658:      *         are {@code < 0}
> 2659:      * @throws IndexOutOfBoundsException if the {@code numChars + 
> srcIndex} is larger than the length of
> 2660:      *         this {@code String} object.

I think this leaves the question: what happens is `srcIndex + numChars` 
overflows and becomes negative? It will be less than the length of the string 
for sure, but not right either.

This is why the other method's javadoc write down e.g. `srcIndex > 
srcArray.length - elementCount`. Assuming all positive numbers, it avoids the 
overflow issue.

test/jdk/java/foreign/TestStringEncoding.java line 108:

> 106:     @Test(dataProvider = "strings")
> 107:     public void testStringsLength(String testString) {
> 108:         Set<String> excluded = Set.of("yen");

I know the yen character is translated to `/` in some encodings when doing a 
round trip. Maybe we should just avoid this issue by switching it out for e.g. 
`"section \u00A7"`, which is `§` and doesn't have the same problem.

test/jdk/java/foreign/TestStringEncoding.java line 180:

> 178:             assertThrows(IndexOutOfBoundsException.class, () ->
> 179:                     MemorySegment.copy(testString, 
> StandardCharsets.UTF_8, 0, text, 0, testString.length() + 1));
> 180:             // dstOffset > byteSize() + B

Suggestion:

            // dstOffset > byteSize() - B

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549653853
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549626703
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549635887

Reply via email to