https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56096
--- Comment #7 from Jeffrey A. Law <law at gcc dot gnu.org> ---
So looking at this from a RISC-V standpoint....
To improve this for RISC-V, the first thing to realize is that the constant
requires synthesis. But in this context there are things we can do.
For rv64gcbv we currently generate:
li a5,32768
addi a5,a5,128
and a1,a1,a5
snez a1,a1
slliw a1,a1,3
srlw a0,a0,a1
We can logically shift a1 right by 7 positions and shift the constant right by
the same amount (0x8080 >> 7 == 257). This is safe because we only care about
the zero/nonzero state of the AND, not precisely which bits are set. So
something like this:
srli a1,a1,7
andi a1,a1,257
snez a1,a1
slliw a1,a1,3
srlw a0,a0,a1
It's probably not any faster on most designs, given that li/addi is a classic
case for fusion. But it improves code density a bit. This is somewhat painful
to realize because of the mvconst_internal pattern. Even without
mvconst_internal the result likely needs to be a define_insn_and_split because
it's going to generate three insns.
We can then use li+czero to replace the snez+slliw resulting in:
srli a1,a1,7
andi a1,a1,257
li a5,8
czero.eqz a1,a5,a1
srlw a0,a0,a1
That's marginally better because the li has no incoming dependencies and can
issue whenever it's convenient for the uarch.