https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114603
--- Comment #1 from GCC Commits ---
The trunk branch has been updated by Richard Sandiford:
https://gcc.gnu.org/g:67cbb1c638d6ab3a9cb77e674541e2b291fb67df
commit r14-9811-g67cbb1c638d6ab3a9cb77e674541e2b291fb67df
Author: Richard Sandiford
Date: Fri Apr 5 14:47:15 2024 +0100
aarch64: Fix bogus cnot optimisation [PR114603]
aarch64-sve.md had a pattern that combined:
cmpeq pb.T, pa/z, zc.T, #0
mov zd.T, pb/z, #1
into:
cnot zd.T, pa/m, zc.T
But this is only valid if pa.T is a ptrue. In other cases, the
original would set inactive elements of zd.T to 0, whereas the
combined form would copy elements from zc.T.
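The difference can be sketched with a scalar, lane-by-lane model in
plain C. This is only an illustration, not code from the patch: the
lane count, the 32-bit element type and the function names
(cmpeq_then_mov, cnot_merging, sequences_agree) are all hypothetical,
and the merging form is assumed to take its inactive lanes from zc.T,
as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* hypothetical vector length, for illustration only */

/* Model of:  cmpeq pb.T, pa/z, zc.T, #0
              mov   zd.T, pb/z, #1
   Inactive lanes of pa clear pb, so zd ends up 0 there.  */
static void cmpeq_then_mov(const bool pa[LANES], const int32_t zc[LANES],
                           int32_t zd[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        bool pb = pa[i] && zc[i] == 0; /* zeroing predication on cmpeq */
        zd[i] = pb ? 1 : 0;            /* zeroing predication on mov */
    }
}

/* Model of:  cnot zd.T, pa/m, zc.T
   Assumption: inactive lanes are merged from zc, as in the commit
   message's description of the combined form.  */
static void cnot_merging(const bool pa[LANES], const int32_t zc[LANES],
                         int32_t zd[LANES])
{
    for (size_t i = 0; i < LANES; i++)
        zd[i] = pa[i] ? (zc[i] == 0 ? 1 : 0) : zc[i];
}

/* Return true if both models produce the same zd for this pa and zc.  */
static bool sequences_agree(const bool pa[LANES], const int32_t zc[LANES])
{
    int32_t two_insn[LANES];
    int32_t combined[LANES];

    assert(pa != NULL && zc != NULL);
    cmpeq_then_mov(pa, zc, two_insn);
    cnot_merging(pa, zc, combined);
    for (size_t i = 0; i < LANES; i++)
        if (two_insn[i] != combined[i])
            return false;
    return true;
}
```

With zc = {0, 5, 0, 7}, sequences_agree returns true for an all-true
pa and false for pa = {true, false, true, false}: lane 1 is 0 in the
two-instruction model but 5 in the merging model.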
gcc/
PR target/114603
* config/aarch64/aarch64-sve.md (@aarch64_pred_cnot): Replace
with...
(@aarch64_ptrue_cnot): ...this, requiring operand 1 to be
a ptrue.
(*cnot): Require operand 1 to be a ptrue.
* config/aarch64/aarch64-sve-builtins-base.cc
(svcnot_impl::expand):
Use aarch64_ptrue_cnot for _x operations that are predicated
with a ptrue. Represent other _x operations as fully-defined _m
operations.
gcc/testsuite/
PR target/114603
* gcc.target/aarch64/sve/acle/general/cnot_1.c: New test.