https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118380
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2025-01-09
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, LLVM likely unrolls all loops completely while we don't, so constant
propagation from the initializer doesn't work. With --param
max-completely-peeled-insns=1000 we produce:
test256:
.LFB7779:
        .cfi_startproc
        vmovss  .LC0(%rip), %xmm0
        ret
which is better than clang, which fails to eliminate an empty loop.
I think this works as intended: the heuristics limit code growth and
compile time, and so obviously cannot account for the full follow-up
optimization. The __builtin_ia32_vbroadcastss256 call is of course a
blocker; confirmed for that part.
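
For illustration only, here is a minimal sketch of the kind of code where
complete peeling matters. It is a hypothetical stand-in, not the test case
from this report: the names table and sum_table are made up. The loop reads
a constant initializer, so once every iteration is peeled the loads become
known constants and the sum can in principle be folded to a single value.

```c
/* Hypothetical stand-in, not the test case from this PR.
   The array has a constant initializer that the loop below reads. */
static const float table[8] = { 1.0f, 2.0f, 3.0f, 4.0f,
                                5.0f, 6.0f, 7.0f, 8.0f };

/* Sum the table.  While the loop is kept as a loop, table[i] is a load
   with a variable index.  After complete peeling each load has a constant
   index, so constant propagation can replace it with the initializer
   value and fold the additions. */
float sum_table(void)
{
    float s = 0.0f;
    for (int i = 0; i < 8; i++)
        s += table[i];
    return s;
}
```

Whether a given GCC version folds this at the default limits is not
something this sketch asserts. One way to check is to compile with -O2 -S,
once as-is and once with --param max-completely-peeled-insns=1000, and
compare the assembly.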