| Field | Value |
|---|---|
| Issue | 86873 |
| Summary | Failure to convert branchy code to branchless |
| Labels | missed-optimization |
| Assignees | |
| Reporter | Kmeakin |

https://godbolt.org/z/cn3d5fGs7
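
Consider these 3 functions for computing the length of a UTF-8 codepoint from its leading byte. They agree on every valid leading byte; `len_utf8_match` differs only on continuation bytes (`0x80..=0xBF`), which it maps to 4 rather than 1: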
```rust
#[no_mangle]
fn len_utf8_match(c: u8) -> usize {
    match c {
        0x00..=0x7F => 1,
        0xC0..=0xDF => 2,
        0xE0..=0xEF => 3,
        _ => 4,
    }
}

#[no_mangle]
fn len_utf8_branchless(c: u8) -> usize {
    let mut ret = 1;
    if (c & 0b1100_0000) == 0b1100_0000 {
        ret = 2;
    }
    if (c & 0b1110_0000) == 0b1110_0000 {
        ret = 3;
    }
    if (c & 0b1111_0000) == 0b1111_0000 {
        ret = 4;
    }
    ret
}

#[no_mangle]
fn len_utf8_branchy(c: u8) -> usize {
    if (c & 0b1111_0000) == 0b1111_0000 {
        return 4;
    }
    if (c & 0b1110_0000) == 0b1110_0000 {
        return 3;
    }
    if (c & 0b1100_0000) == 0b1100_0000 {
        return 2;
    }
    1
}
```
For aarch64, `len_utf8_branchless` is the clear winner. For x86_64 and RISC-V 64, I think the best results are from `len_utf8_branchless` and `len_utf8_branchy`.
In any case, `len_utf8_branchless` and `len_utf8_branchy` are equivalent, so identical assembly should be produced for both.
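
The equivalence claim can be checked exhaustively, since there are only 256 possible inputs. The following is a minimal sketch, not part of the original report: the three functions are copied from above (without `#[no_mangle]`), and `disagreements` is a hypothetical helper introduced here for illustration.

```rust
fn len_utf8_match(c: u8) -> usize {
    match c {
        0x00..=0x7F => 1,
        0xC0..=0xDF => 2,
        0xE0..=0xEF => 3,
        _ => 4,
    }
}

fn len_utf8_branchless(c: u8) -> usize {
    let mut ret = 1;
    if (c & 0b1100_0000) == 0b1100_0000 {
        ret = 2;
    }
    if (c & 0b1110_0000) == 0b1110_0000 {
        ret = 3;
    }
    if (c & 0b1111_0000) == 0b1111_0000 {
        ret = 4;
    }
    ret
}

fn len_utf8_branchy(c: u8) -> usize {
    if (c & 0b1111_0000) == 0b1111_0000 {
        return 4;
    }
    if (c & 0b1110_0000) == 0b1110_0000 {
        return 3;
    }
    if (c & 0b1100_0000) == 0b1100_0000 {
        return 2;
    }
    1
}

/// Hypothetical helper (not from the report): collects every byte
/// on which the two functions return different lengths.
fn disagreements(f: fn(u8) -> usize, g: fn(u8) -> usize) -> Vec<u8> {
    (0..=u8::MAX).filter(|&c| f(c) != g(c)).collect()
}

fn main() {
    // The two if-chain variants agree on all 256 inputs.
    let chain_diff = disagreements(len_utf8_branchless, len_utf8_branchy);
    assert!(chain_diff.is_empty());

    // The match variant differs from them only on continuation bytes.
    let match_diff = disagreements(len_utf8_match, len_utf8_branchless);
    println!("match vs. if-chain: {} differing bytes", match_diff.len());
}
```

Because the two if-chain variants agree on every input, any difference in their generated assembly is purely a codegen matter, which is the point of this report.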