On Wed, 19 Nov 2025 14:52:04 GMT, Maurizio Cimadamore <[email protected]>
wrote:
>> Liam Miller-Cushon has updated the pull request incrementally with one
>> additional commit since the last revision:
>>
>> Add a dstOffset parameter, stop using
>> StringCharBuffer/CharsetEncoder::encode
>
> src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 2649:
>
>> 2647: *
>> 2648: * @param src the Java string to be written into this segment
>> 2649: * @param dstEncoding the charset used to {@linkplain
>> Charset#newEncoder() encode}
>
> I'm not sure we have a dependency on the charset being standard?
We do not, thanks, fixed.
That said, I think the existing `allocateFrom(String, Charset)` method does have
an undocumented dependency: it uses `CharsetKind` to get the terminator char
length, and `CharsetKind` only supports the standard charsets. If we add a fast
path for UTF-16, that may need a dependency on a standard charset (or on a
standard way to get the code unit size of a charset, if one exists).
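For illustration, one charset-agnostic way to derive the terminator byte length
would be to encode a single NUL character with the charset itself, rather than
consulting `CharsetKind`. This is just a sketch (the helper name is hypothetical,
not the actual implementation), and note that charsets which emit a BOM, such as
plain UTF-16, would need extra care:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TerminatorSize {
    // Hypothetical helper: compute the NUL terminator byte length for an
    // arbitrary charset by encoding "\0", instead of relying on CharsetKind
    // (which only knows the standard charsets). Caveat: charsets that
    // prepend a BOM (e.g. plain UTF-16) would report extra bytes here.
    static int terminatorSize(Charset charset) {
        return "\0".getBytes(charset).length;
    }

    public static void main(String[] args) {
        System.out.println(terminatorSize(StandardCharsets.UTF_8));    // 1
        System.out.println(terminatorSize(StandardCharsets.UTF_16LE)); // 2
    }
}
```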
> src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java line 194:
>
>> 192: MemorySegment segment;
>> 193: if (StringSupport.bytesCompatible(str, charset, srcIndex,
>> numChars)) {
>> 194: segment = allocateNoInit(numChars);
>
> This also seems to rely on the fact that we end up here only for latin1
> strings. Again, I don't think this is correct, but if it's deliberate, we
> should add an assertion check.
Good point. I think we make a similar assumption in the existing
`allocateFrom(String, Charset)`: it does `length + termCharSize`, which should
perhaps be `(length + 1) * codeUnitSize`.
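To make the arithmetic concrete, here is a sketch of why the two expressions
diverge for a multi-byte charset (variable names are hypothetical, chosen to
mirror the discussion, not the actual implementation):

```java
public class AllocationSize {
    // Hypothetical helper mirroring the proposed expression: room for every
    // code unit of the string plus one terminating NUL code unit, each
    // codeUnitSize bytes wide.
    static long proposedSize(int numChars, int codeUnitSize) {
        return (long) (numChars + 1) * codeUnitSize;
    }

    public static void main(String[] args) {
        int numChars = 3;      // e.g. the string "abc"
        int codeUnitSize = 2;  // UTF-16LE code unit width in bytes
        int termCharSize = 2;  // one NUL code unit

        // The existing expression under-allocates for a 2-byte charset:
        long existing = numChars + termCharSize;               // 3 + 2 = 5 bytes
        long proposed = proposedSize(numChars, codeUnitSize);  // (3 + 1) * 2 = 8 bytes
        System.out.println(existing + " vs " + proposed);      // prints "5 vs 8"
    }
}
```

For a latin1 string (`codeUnitSize == termCharSize == 1`) both expressions agree, which is presumably why the fast path has worked so far.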
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2544835957
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2544862879