Issue 161746
Summary Implement `clz` using floating point math if `clz` instruction is not available
Labels missed-optimization
Assignees
Reporter Kmeakin
    In https://graphics.stanford.edu/~seander/bithacks.html#IntegerLogIEEE64Float we can find a neat trick to calculate the ilog2 of a u32 with floating point math:

```c++
u32 ilog2(u32 x) {
    // zexto extend to 64 bits
    u64 x_u64 = x;

    // set exponent to 52
    x_u64 |= (52 + 1023ULL) << 52;

    // move into f64 register
    double f = __builtin_bit_cast(double, x_u64);

    // subtract 2^52
    f -= 0x1.0p52;

    // move back into u64 register
    x_u64 = __builtin_bit_cast(u64, f);

    // extract exponent
    return (x_u64 >> 52) - 1023;
}
```

Rearrange the equation:
```
ilog2(x) == 31 - clz(x)
clz(x) + ilog2(x) == 31
clz(x) == 31 - ilog2(x)
```

and we get:
```c++
u32 src32(u32 x) { return 31 - __builtin_clzg(x); }
u32 tgt32(u32 x) { return 31 - ilog2_u32(x); }
```

This generalizes to u8 and u16 using float instead of double

https://godbolt.org/z/cW8js3MK9
https://alive2.llvm.org/ce/z/-bK_nC
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

Reply via email to