So as noted in the PR, GCC fails to optimize this well:


int8_t f(int8_t x)
{
    int8_t sh = 1 << x;
    return sh & 1;
}

I strongly suspect this kind of code is exceedingly rare in practice.  I just happened to notice that it could be improved when looking for bugs to pass along to Shreya & Austin.  As noted in the PR, most of the time this is cleaned up in gimple, but in some cases it slips through.

I'd love to tackle this in simplify-rtx, but SHIFT_COUNT_TRUNCATED, mode handling for shift counts, subregs to deal with 32-bit objects on 64-bit targets, etc. make it fairly messy.  Rather than spend a ton of time on it, I've just created a simple RISC-V splitter to handle the case of a 32-bit shift on rv64.  The other cases (rv32, and 64-bit objects on rv64) can't be improved this way.



For rv64 we generate:

        li      a5,1            # 7     [c=4 l=4] *movsi_internal/1
        sllw    a0,a5,a0        # 8     [c=8 l=4]  ashlsi3_extend
        andi    a0,a0,1 # 17    [c=4 l=4]  *anddi3/1


Instead we can generate:


        andi    a0,a0,31        # 8     [c=4 l=4]  *anddi3/1
        seqz    a0,a0   # 17    [c=4 l=4]  *seq_zero_didi


I purposefully added the masking of the shift count.  While the RISC-V port does not define SHIFT_COUNT_TRUNCATED, it does have patterns that optimize away the masking when they can.  If the masking had been optimized away on the assumption that the count would only be used in a shift/rotate, and thus be masked by the hardware, we could have junk in the upper bits of the count.  It's worth noting that because of the need to sanitize the shift count we're generating 2 insns, thus we can't really improve things for rv32 or for 64-bit objects on rv64.  If we didn't need to do that, this would be a define_insn that generated a single instruction.
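To make the reasoning about the masking concrete, here is a small C sketch of the two sequences.  It is only a model and not part of the patch: the function names are made up for illustration, and sllw is modeled under the assumption that the hardware uses just the low 5 bits of the count.  It compares the original li/sllw/andi result with the andi/seqz replacement, and also shows what goes wrong if the count is tested against zero without masking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of rv64 "sllw rd, 1, count": only the low 5 bits
   of the count are used by the shift.  */
uint32_t sllw_one(uint64_t count)
{
    return (uint32_t)1 << (count & 31);
}

/* Original sequence: li + sllw + andi, i.e. bit 0 of (1 << count).  */
uint64_t bit0_via_shift(uint64_t count)
{
    return sllw_one(count) & 1;
}

/* Replacement sequence: andi count,31 followed by seqz.  */
uint64_t bit0_via_seqz(uint64_t count)
{
    return (count & 31) == 0;
}

/* What a single seqz would compute if the count were not sanitized.  */
uint64_t bit0_via_seqz_unmasked(uint64_t count)
{
    return count == 0;
}

/* Returns 1 if both sequences agree for counts 0..255, with and
   without junk in the upper bits of the count register.  */
int sequences_agree(void)
{
    for (uint64_t c = 0; c < 256; c++) {
        uint64_t junk = c | 0xdeadbeef00000000ull;
        if (bit0_via_shift(c) != bit0_via_seqz(c))
            return 0;
        if (bit0_via_shift(junk) != bit0_via_seqz(junk))
            return 0;
    }
    return 1;
}
```

In this model a count of 32 behaves like a count of 0 in the shift, so the masked test matches the shift while the unmasked test does not.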

Bootstrapped and regression tested on rv64 on both the BPI and the Pioneer.  Also regression tested on riscv32-elf and riscv64-elf.  Planning to push once pre-commit CI gives the green light.

Jeff
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 4a49e778fed5..e4dac32d71db 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -5168,10 +5168,22 @@ (define_insn "*sign_bit_splat_equality_test"
 }
   [(set_attr "type" "branch")
    (set_attr "mode" "none")])
-                                       
-       
 
-;; Standard extensions and pattern for optimization
+;; We can save an instruction for this case.  Essentially we can
+;; test the (sanitized) shift count against zero.  This only comes
+;; up for 32 bit objects on rv64.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+       (and:DI (subreg:DI
+                (ashift:SI (const_int 1)
+                           (match_operand:QI 1 "register_operand")) 0)
+               (const_int 1)))
+   (clobber (match_operand:DI 2 "register_operand"))]
+  "TARGET_64BIT"
+  [(set (match_dup 2) (and:DI (match_dup 1) (const_int 31)))
+   (set (match_dup 0) (eq:DI (match_dup 2) (const_int 0)))]
+  { operands[1] = gen_lowpart (DImode, operands[1]); })
+
 (include "bitmanip.md")
 (include "crypto.md")
 (include "sync.md")
diff --git a/gcc/testsuite/gcc.target/riscv/pr106244.c b/gcc/testsuite/gcc.target/riscv/pr106244.c
new file mode 100644
index 000000000000..18c2e1b49504
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr106244.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target rv64 } } */
+/* { dg-additional-options "-march=rv64gc_zicond -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og"} } */
+
+typedef signed char int8_t;
+int8_t f(int8_t x)
+{
+    int8_t sh = 1 << x;
+    return sh & 1;
+}
+
+/* { dg-final { scan-assembler-not "li\t" } } */
+/* { dg-final { scan-assembler-not "sllw\t" } } */
+/* { dg-final { scan-assembler-times "andi\t" 1 } } */
+/* { dg-final { scan-assembler-times "seqz\t" 1 } } */