| Issue |
161746
|
| Summary |
Implement `clz` using floating point math if `clz` instruction is not available
|
| Labels |
missed-optimization
|
| Assignees |
|
| Reporter |
Kmeakin
|
In https://graphics.stanford.edu/~seander/bithacks.html#IntegerLogIEEE64Float we can find a neat trick to calculate the ilog2 of a u32 with floating point math:
```c++
u32 ilog2(u32 x) {
// zexto extend to 64 bits
u64 x_u64 = x;
// set exponent to 52
x_u64 |= (52 + 1023ULL) << 52;
// move into f64 register
double f = __builtin_bit_cast(double, x_u64);
// subtract 2^52
f -= 0x1.0p52;
// move back into u64 register
x_u64 = __builtin_bit_cast(u64, f);
// extract exponent
return (x_u64 >> 52) - 1023;
}
```
Rearrange the equation:
```
ilog2(x) == 31 - clz(x)
clz(x) + ilog2(x) == 31
clz(x) == 31 - ilog2(x)
```
and we get:
```c++
u32 src32(u32 x) { return 31 - __builtin_clzg(x); }
u32 tgt32(u32 x) { return 31 - ilog2_u32(x); }
```
This generalizes to u8 and u16 using float instead of double
https://godbolt.org/z/cW8js3MK9
https://alive2.llvm.org/ce/z/-bK_nC
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs