On Sun, 8 Mar 2026 14:54:28 GMT, Tatsunori Uchino <[email protected]> wrote:

>> I don't think so. Your suggestion doesn't simplify the code:
>> 
>> 
>> boolean precededByHigh = false;
>> for (int i = 0; i < lastIndex; i++) {
>>     char c = charAt(i);
>>     if (precededByHigh) {
>>         if (Character.isLowSurrogate(c)) {
>>             n--;
>>         }
>>         precededByHigh = false;
>>     } else {
>>         precededByHigh = Character.isHighSurrogate(c);
>>     }
>> }
>
> vs:
> 
> 
> for (int i = 0; i < lastIndex;) {
>     if (Character.isHighSurrogate(charAt(i++))) {
>         if (i >= lastIndex) break;
>         if (Character.isLowSurrogate(charAt(i))) {
>             n--;
>             i++;
>         }
>     }
> }
> 
> 
> - No `else`.
> - No state variables.
> - Branch prediction for the second and third `if` statements will succeed 
> 100% of the time for well-formed code unit sequences (normal strings).

Does the suggested code have a bug? I think it returns 2 for 
"\ud800\udc00": the loop breaks before it reaches the trailing low surrogate.
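
One way to check this is a small standalone program. This is only a sketch: 
the quoted snippet does not show how `n` and `lastIndex` are initialized, so 
it assumes `n` starts at the sequence length and takes `lastIndex` as a 
parameter. The class and method names are made up for the test.

```java
public class SurrogateCountCheck {
    // Hypothetical wrapper around the suggested loop. The initial value of n
    // and the meaning of lastIndex are assumptions; the quoted snippet does
    // not define either of them.
    static int countSuggested(CharSequence s, int lastIndex) {
        int n = s.length();
        for (int i = 0; i < lastIndex;) {
            if (Character.isHighSurrogate(s.charAt(i++))) {
                if (i >= lastIndex) break;
                if (Character.isLowSurrogate(s.charAt(i))) {
                    n--; // a well-formed pair counts as one code point
                    i++;
                }
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String pair = "\ud800\udc00"; // one supplementary code point, two chars

        // Reference result from the existing JDK method: prints 1
        System.out.println(Character.codePointCount(pair, 0, pair.length()));

        // lastIndex == length(): the low surrogate is examined, prints 1
        System.out.println(countSuggested(pair, pair.length()));

        // lastIndex == length() - 1: the loop breaks first, prints 2
        System.out.println(countSuggested(pair, pair.length() - 1));
    }
}
```

Under these assumptions the pair is counted correctly when `lastIndex` is 
`length()`, and the result is 2 only when `lastIndex` is `length() - 1`, so 
the answer depends on which bound the PR uses.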

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26461#discussion_r3423966822
