https://bugs.llvm.org/show_bug.cgi?id=50189
Bug ID: 50189
Summary: Missed optimization to auto-vectorize __builtin_bitreverse
Product: clang
Version: unspecified
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: C++
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected], [email protected],
[email protected], [email protected],
[email protected]
This code:
#include <cstdint>

// Defines bitreverseN_loop(), which bit-reverses every element in place.
#define DEFINE_BITREVERSE_LOOP(N) \
  void bitreverse##N##_loop(int##N##_t* values, int n) { \
    for (int i = 0; i < n; i++) { \
      values[i] = __builtin_bitreverse##N(values[i]); \
    } \
  }

DEFINE_BITREVERSE_LOOP(8);
DEFINE_BITREVERSE_LOOP(16);
DEFINE_BITREVERSE_LOOP(32);
DEFINE_BITREVERSE_LOOP(64);
When compiled with -O3 -march=armv8.2-a+fp16+rcpc+dotprod+crypto+simd+crc,
both clang-9 and clang-11 produce all kinds of code (sometimes even
auto-vectorized, for bytes) instead of just passing vector registers to the
rbit instruction.
This seems a bit ironic, given that __builtin_bitreverse was originally
invented to support arm rbit.
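
For reference, a rough sketch of the decomposition a vectorized lowering could
use for the wider element types. As far as I know, the Advanced SIMD form of
rbit only reverses bits within each byte lane, so 16/32/64-bit elements would
also need a byte reversal (rev16/rev32/rev64) around it. The portable C++ below
models that for 32-bit elements and checks it against a naive bit-by-bit
reversal. All function names here are made up for illustration; this is not
what clang emits or how LLVM implements the lowering.

```cpp
#include <cassert>
#include <cstdint>

// Naive reference: reverse all 32 bits of x one bit at a time.
inline std::uint32_t bitreverse32_reference(std::uint32_t x) {
    std::uint32_t r = 0;
    for (int i = 0; i < 32; ++i) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}

// Reverse the bits inside each byte independently. This models what a
// byte-wise vector bit reverse does to every 8-bit lane.
inline std::uint32_t reverse_bits_in_each_byte(std::uint32_t x) {
    x = ((x & 0x0F0F0F0Fu) << 4) | ((x >> 4) & 0x0F0F0F0Fu);  // swap nibbles
    x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);  // swap bit pairs
    x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);  // swap neighbours
    return x;
}

// Reverse the order of the four bytes. This models a byte reversal
// within each 32-bit element.
inline std::uint32_t byteswap32(std::uint32_t x) {
    return (x << 24) | ((x & 0x0000FF00u) << 8) |
           ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

// Full 32-bit bit reversal = byte reversal followed by per-byte bit reversal.
inline std::uint32_t bitreverse32_via_bytes(std::uint32_t x) {
    return reverse_bits_in_each_byte(byteswap32(x));
}

// Checks the decomposition against the naive reference on a few samples.
inline void check_decomposition() {
    const std::uint32_t samples[] = {0u, 1u, 0x80000000u, 0x12345678u,
                                     0xDEADBEEFu, 0xFFFFFFFFu};
    for (std::uint32_t s : samples) {
        assert(bitreverse32_via_bytes(s) == bitreverse32_reference(s));
    }
}
```

Calling check_decomposition() from a small main() exercises the equivalence.
The same idea should carry over to 16-bit and 64-bit elements by changing the
width of the byte reversal.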