https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124097

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
But if the predicate is required anyway I would have expected the predicated
add to be faster.  That is, it's not the predicated add that is bad but the
predicate generation.  This would also depend on the target, so match.pd might
not be the best place to perform this "optimization".

On Zen with AVX512, the compare into a %k mask register also has comparatively
high latency (it is slower than the AVX2 compare into %xmm).
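
To make the distinction concrete, here is a minimal scalar sketch of the two
forms.  The loop shape, the function names and the "x > thresh" condition are
assumptions for illustration only, they are not taken from the testcase in
this PR.  In both forms the comparison has to be computed; they differ only in
whether the add itself consumes the predicate.

```c
#include <stddef.h>
#include <stdint.h>

/* Form A (hypothetical example): the predicate is generated first and the
   add is predicated on it.  Lanes with a false predicate keep acc[i].  */
void add_predicated(int32_t *acc, const int32_t *x, const int32_t *thresh,
                    size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int pred = x[i] > thresh[i];   /* predicate generation */
        if (pred)
            acc[i] = acc[i] + x[i];    /* predicated add */
    }
}

/* Form B (hypothetical example): the predicate selects the addend
   (x[i] or 0) and the add is unconditional.  */
void add_selected(int32_t *acc, const int32_t *x, const int32_t *thresh,
                  size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t addend = x[i] > thresh[i] ? x[i] : 0;  /* select on predicate */
        acc[i] = acc[i] + addend;                      /* plain add */
    }
}
```

Both functions compute the same result; which one vectorizes to faster code
would depend on how expensive the target makes the compare that produces the
predicate, which is the part that argues against deciding this in match.pd.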
