On Wed, 30 Jul 2025 18:12:03 GMT, Volkan Yazici <vyaz...@openjdk.org> wrote:
>> Chen Liang has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Add paragraph for endianness and layout
>
> src/java.base/share/classes/java/lang/StringUTF16.java line 51:
>
>> 49: ///
>> 50: /// All indices and sizes for byte arrays carrying UTF16 data are in number of
>> 51: /// chars instead of number of bytes.
>
> Nit on cosmetics:
>
> Suggestion:
>
> /// UTF-16 `String` operations.
> ///
> /// UTF-16 byte arrays have the identical layout as `char` arrays. They share the
> /// same base offset and scale, and for each two-byte unit interpreted as a `char`,
> /// it has the same endianness as a `char`, which is the platform endianness.
> /// This is ensured in the static initializer of [StringUTF16].
> ///
> /// All indices and sizes for byte arrays carrying UTF-16 data are in number of
> /// `char`s instead of number of bytes.

Unfortunately, I don't think I will use your suggestion, except maybe for the whitespace fix.

1. UTF16 is derived from `String.UTF16`, so I don't think I will stylize that.
2. The number of chars is the number of characters. I double-checked, and it seems `length` is the only one that returns the byte array length instead of using the character count as its unit. There are some LATIN1-accepting APIs, and those APIs also use the number of characters, except that in the LATIN1 case the number is identical to the number of chars.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26541#discussion_r2245962980
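
For readers less familiar with the convention discussed above, here is a minimal sketch of what "indices in number of chars, not bytes" means for a byte array carrying UTF-16 data in platform endianness. The class `Utf16IndexSketch` and its helpers (`lengthInChars`, `putChar`, `getChar`, `toBytes`) are hypothetical names made up for illustration; this is not the code in `StringUTF16`, which is internal to `java.lang`.

```java
import java.nio.ByteOrder;

public class Utf16IndexSketch {
    // Assumption for this sketch: two-byte units follow the platform byte order.
    static final boolean BIG_ENDIAN =
            ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN;

    // Size of the array expressed in chars: half the byte length.
    static int lengthInChars(byte[] value) {
        return value.length >> 1;
    }

    // Stores c at char index `index`, i.e. at byte offset index * 2.
    static void putChar(byte[] value, int index, char c) {
        int offset = index << 1;
        if (BIG_ENDIAN) {
            value[offset] = (byte) (c >> 8);
            value[offset + 1] = (byte) c;
        } else {
            value[offset] = (byte) c;
            value[offset + 1] = (byte) (c >> 8);
        }
    }

    // Reads the two-byte unit at char index `index` as a char.
    static char getChar(byte[] value, int index) {
        int offset = index << 1;
        int b0 = value[offset] & 0xff;
        int b1 = value[offset + 1] & 0xff;
        return (char) (BIG_ENDIAN ? (b0 << 8) | b1 : (b1 << 8) | b0);
    }

    // Encodes a String into a UTF-16 byte array using the helpers above.
    static byte[] toBytes(String s) {
        byte[] value = new byte[s.length() << 1];
        for (int i = 0; i < s.length(); i++) {
            putChar(value, i, s.charAt(i));
        }
        return value;
    }
}
```

With these helpers, a three-char string occupies six bytes, yet callers address it with indices 0 through 2; the doubling happens only inside the accessors.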