On Sun, 8 Mar 2026 14:54:28 GMT, Tatsunori Uchino <[email protected]> wrote:
>> I don't think so. Your suggestion doesn't simplify the code:
>>
>>
>> boolean precededByHigh = false;
>> for (int i = 0; i < lastIndex; i++) {
>>     char c = charAt(i);
>>     if (precededByHigh) {
>>         if (Character.isLowSurrogate(c)) {
>>             n--;
>>         }
>>         precededByHigh = false;
>>     } else {
>>         precededByHigh = Character.isHighSurrogate(c);
>>     }
>> }
>
> vs:
>
>
> for (int i = 0; i < lastIndex;) {
>     if (Character.isHighSurrogate(charAt(i++))) {
>         if (i >= lastIndex) break;
>         if (Character.isLowSurrogate(charAt(i))) {
>             n--;
>             i++;
>         }
>     }
> }
>
>
> - No `else`.
> - No state variables.
> - Branch prediction for the second and third `if` statements will succeed
> 100% of the time for well-formed code unit sequences (normal strings).
Does the suggested code have a bug? I think the code returns 2 for
"\ud800\udc00". The loop breaks before the last low surrogate.
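
A minimal sketch to check this. The quoted snippet does not show how
`n` and `lastIndex` are defined, so both are assumptions here: `n` starts at
the number of chars, and the bound is passed in explicitly. The class and
method names are hypothetical. Under these assumptions the result is 2 when
the bound is the index of the last char (length - 1), and 1 when the bound is
the length.

```java
public class SurrogateLoopCheck {
    // The suggested loop, copied from the quote above, with the bound
    // passed in as a parameter (hypothetical wrapper for illustration).
    // Assumption: n starts at the number of chars in the sequence.
    public static int count(CharSequence s, int lastIndex) {
        int n = s.length();
        for (int i = 0; i < lastIndex;) {
            if (Character.isHighSurrogate(s.charAt(i++))) {
                if (i >= lastIndex) break;
                if (Character.isLowSurrogate(s.charAt(i))) {
                    n--;
                    i++;
                }
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // One supplementary code point encoded as two chars.
        String s = "\ud800\udc00";

        // Bound = index of the last char: the break fires before the
        // low surrogate is examined, so n is never decremented.
        System.out.println(count(s, s.length() - 1)); // prints 2

        // Bound = length: the pair is recognized and n is decremented.
        System.out.println(count(s, s.length())); // prints 1

        // Reference value from the existing String API.
        System.out.println(s.codePointCount(0, s.length())); // prints 1
    }
}
```

So whether the early `break` is a bug depends on whether `lastIndex` is
meant to be inclusive or exclusive.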
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/26461#discussion_r3423966822