https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56096

--- Comment #7 from Jeffrey A. Law <law at gcc dot gnu.org> ---
So looking at this from a RISC-V standpoint....

To improve this for RISC-V, the first thing to realize is that the constant
(32768 + 128 = 0x8080) requires synthesis; it does not fit in a single
immediate.  But in this context there are things we can do.

For rv64gcbv we currently generate:

        li      a5,32768
        addi    a5,a5,128
        and     a1,a1,a5
        snez    a1,a1
        slliw   a1,a1,3
        srlw    a0,a0,a1

We can logically shift a1 right by 7 positions and shift the constant right by
7 as well (0x8080 >> 7 = 257, which fits in an andi immediate).  This is safe
because we only care about the zero/nonzero state of the AND, not precisely
which bits are set.  So something like this:

        srli    a1,a1,7
        andi    a1,a1,257
        snez    a1,a1
        slliw   a1,a1,3
        srlw    a0,a0,a1

It's probably not any faster on most designs, given that li/addi is a classic
case for fusion, but it improves code density a bit.  This is somewhat painful
to realize because of the mvconst_internal pattern.  Even without
mvconst_internal the result likely needs to be a define_insn_and_split because
it's going to generate 3 insns.

We can then use li+czero.eqz to replace the snez+slliw pair, resulting in:

        srli    a1,a1,7
        andi    a1,a1,257
        li      a5,8
        czero.eqz a1,a5,a1
        srlw    a0,a0,a1

That's marginally better because the li has no incoming dependencies and can
issue whenever it's convenient for the uarch.
