Re: RFR: 8364317: Explicitly document some assumptions of StringUTF16 [v2]

Chen Liang Thu, 31 Jul 2025 10:22:40 -0700

On Wed, 30 Jul 2025 18:12:03 GMT, Volkan Yazici <vyaz...@openjdk.org> wrote:


>> Chen Liang has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Add paragraph for endianness and layout
>
> src/java.base/share/classes/java/lang/StringUTF16.java line 51:
> 
>> 49: ///
>> 50: /// All indices and sizes for byte arrays carrying UTF16 data are in number of
>> 51: /// chars instead of  number of bytes.
> 
> Nit on cosmetics:
> 
> Suggestion:
> 
> /// UTF-16 `String` operations.
> ///
> /// UTF-16 byte arrays have the identical layout as `char` arrays. They share the
> /// same base offset and scale, and for each two-byte unit interpreted as a `char`,
> /// it has the same endianness as a `char`, which is the platform endianness.
> /// This is ensured in the static initializer of [StringUTF16].
> ///
> /// All indices and sizes for byte arrays carrying UTF-16 data are in number of
> /// `char`s instead of number of bytes.

Unfortunately, I don't think I will use your suggestion, except perhaps for the 
whitespace fix.

1. UTF16 is derived from `String.UTF16`, so I don't think I will restyle it as 
UTF-16.
2. The number of chars is the number of characters. I double-checked, and it 
seems `length` is the only method that returns the byte array length instead of 
using the character count as its unit. There are some LATIN1-accepting APIs, and 
those APIs also use the number of characters, except that in the LATIN1 case 
the number is identical to the number of chars.
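
To illustrate the "indices are in chars, not bytes" convention discussed above, 
here is a minimal sketch. It is not the JDK source: the class 
`Utf16IndexSketch` and its helpers `charCount`, `putChar`, and `getChar` are 
hypothetical names chosen for illustration, and the shifts assume the two bytes 
of each unit are stored in platform byte order, as the proposed doc comment 
describes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only: hypothetical helpers, not the actual StringUTF16 code.
public class Utf16IndexSketch {
    // Shifts applied to the first and second byte of each two-byte unit,
    // chosen once so the layout matches the platform byte order.
    private static final int FIRST_BYTE_SHIFT;
    private static final int SECOND_BYTE_SHIFT;

    static {
        boolean bigEndian = ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN;
        FIRST_BYTE_SHIFT = bigEndian ? 8 : 0;
        SECOND_BYTE_SHIFT = bigEndian ? 0 : 8;
    }

    // Size in chars: half the byte array length.
    public static int charCount(byte[] value) {
        return value.length >> 1;
    }

    // 'index' counts chars; the byte offset is index * 2.
    public static void putChar(byte[] value, int index, char c) {
        int i = index << 1;
        value[i] = (byte) (c >> FIRST_BYTE_SHIFT);
        value[i + 1] = (byte) (c >> SECOND_BYTE_SHIFT);
    }

    // 'index' counts chars; the byte offset is index * 2.
    public static char getChar(byte[] value, int index) {
        int i = index << 1;
        return (char) (((value[i] & 0xff) << FIRST_BYTE_SHIFT)
                | ((value[i + 1] & 0xff) << SECOND_BYTE_SHIFT));
    }

    public static void main(String[] args) {
        String s = "a\u20ACc";
        byte[] value = new byte[s.length() * 2];
        for (int i = 0; i < s.length(); i++) {
            putChar(value, i, s.charAt(i));
        }
        // A native-order ByteBuffer view takes a byte offset (2 here),
        // while getChar above takes a char index (1) for the same unit.
        char viaBuffer = ByteBuffer.wrap(value)
                .order(ByteOrder.nativeOrder())
                .getChar(2);
        System.out.println(charCount(value));
        System.out.println(getChar(value, 1) == viaBuffer);
    }
}
```

The comparison against a native-order `ByteBuffer` is only there to show that 
char index 1 and byte offset 2 address the same two-byte unit under the stated 
endianness assumption.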

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26541#discussion_r2245962980
