So as the PR notes, this is an attempt to squeeze out some instructions from a hot part of leela, the random number generator in particular.

typedef unsigned int uint32;


uint32 random(uint32 s1) {
    const uint32 mask = 0xffffffff;
    s1 = (((s1 & 0xFFFFFFFEU) << 12) & mask);
    return s1;
}

Generates this RISC-V code:

        slli    a5,a0,44        # 25    [c=4 l=4]  ashldi3
        srai    a0,a5,44        # 26    [c=4 l=4]  ashrdi3
        andi    a0,a0,-2        # 21    [c=4 l=4]  *anddi3/1
        slli    a0,a0,12        # 22    [c=4 l=4]  ashldi3

But this is an equivalent sequence:

        andi    a0, a0, -2
        slliw   a0, a0, 12

The key is realizing that the first two instructions are just a sign-extended bitfield extraction of length 20.  That ultimately gets shifted left 12 bits.  20+12 = 32, so we can at least conceptually use slliw (shift left, sign extending the result from SI to DI).  The andi just turns off the low bit.
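
To convince myself of the equivalence, here is a small C model of the two sequences.  It's only a sketch: registers are modelled as uint64_t, and the names sext, seq_long, seq_short and check_all_fields are made up for illustration, not anything from GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of v to 64 bits using only
   unsigned arithmetic (no implementation-defined shifts).  */
static uint64_t sext(uint64_t v, unsigned bits)
{
    uint64_t sign = 1ULL << (bits - 1);
    uint64_t low = v & ((sign << 1) - 1);
    return (low ^ sign) - sign;
}

/* Model of the four-instruction sequence.  */
uint64_t seq_long(uint64_t a0)
{
    uint64_t field = sext(a0, 20);      /* slli a5,a0,44 ; srai a0,a5,44 */
    uint64_t masked = field & ~1ULL;    /* andi a0,a0,-2 */
    return masked << 12;                /* slli a0,a0,12 */
}

/* Model of the two-instruction sequence.  */
uint64_t seq_short(uint64_t a0)
{
    uint64_t masked = a0 & ~1ULL;       /* andi a0,a0,-2 */
    return sext(masked << 12, 32);      /* slliw a0,a0,12 */
}

/* Try every 20-bit field value under a few arbitrary upper-bit patterns.  */
void check_all_fields(void)
{
    static const uint64_t upper[] = {
        0x0ULL, 0xFFFFFFFFFFF00000ULL, 0x123456789AB00000ULL
    };
    for (unsigned i = 0; i < sizeof upper / sizeof upper[0]; i++)
        for (uint64_t field = 0; field < (1ULL << 20); field++)
            assert(seq_long(upper[i] | field) == seq_short(upper[i] | field));
}
```

Calling check_all_fields() exercises all 2^20 field values.  The upper 44 input bits never matter because both sequences discard them.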

A sign-extracted bitfield starting at bit 0, of size N, that is then left shifted by M, where N+M == 32, is a natural slliw instruction.  However, when I tried to recognize that and generate the slliw form I saw code quality regressions that didn't look particularly reasonable to try and fix.  So we want to be more selective about recognizing that idiom: we only recognize it when we subsequently mask off some bits and the mask can be encoded via andi.  This likely could be extended to other logical operations that don't ultimately affect the SI sign bit.
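
As a rough sketch of that selection rule, written as a standalone C predicate: the helper names fits_andi_imm, asr and idiom_applies are hypothetical, and the mask is assumed to be the one applied after the shift, so it is shifted back down by M before checking it against andi's 12-bit signed immediate range.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if v fits andi's 12-bit signed immediate, i.e. [-2048, 2047].  */
static bool fits_andi_imm(int64_t v)
{
    return v >= -2048 && v <= 2047;
}

/* Portable arithmetic shift right (floor division by 2^shift).  */
static int64_t asr(int64_t v, int shift)
{
    return v < 0 ? ~(~v >> shift) : v >> shift;
}

/* Hypothetical check: a field of `size` bits starting at bit 0 is
   shifted left by `shift`, then ANDed with `mask`.  */
bool idiom_applies(int size, int shift, int64_t mask)
{
    if (size < 1 || shift < 0 || size + shift != 32)
        return false;
    return fits_andi_imm(asr(mask, shift));
}
```

For the leela case the field is 20 bits, the shift is 12, and the post-shift mask is -8192, which shifts down to -2 and fits andi.  A mask like 0x7FFFF000 would shift down to 0x7FFFF, which does not fit, so the idiom would not be recognized.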

So here's the patch I'm playing with right now.  It's passed riscv32-elf and riscv64-elf.  Bootstrap on the BPI and Pioneer is in progress.  I'm posting it now to get the CI system chewing on it overnight.

Jeff
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 4a49e778fed5..b6c29db13c1d 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -5168,8 +5168,37 @@ (define_insn "*sign_bit_splat_equality_test"
 }
   [(set_attr "type" "branch")
    (set_attr "mode" "none")])
-                                       
-       
+
+
+;; The basic idea is to realize that we can get the sign extension
+;; for free when sign extracting a field shifting it such that
+;; the sign bit of the field ends up in the SI sign bit.  In that
+;; case it's just a slliw.
+;;
+;; It is tempting to do the extract+shift rewriting independent of
+;; the outer AND.  But that's shown to regress code quality in other
+;; contexts.  So we're being more conservative about trying to
+;; exploit the free sign extension opportunities that show up with
+;; shifted sign extractions
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+       (and:DI
+        (ashift:DI (sign_extract:DI (match_operand:DI 1 "register_operand")
+                                    (match_operand 2 "const_int_operand")
+                                    (match_operand 3 "const_int_operand"))
+                   (match_operand 4 "const_int_operand"))
+        (match_operand:DI 5 "const_int_operand")))
+   (clobber (match_operand:DI 6 "register_operand"))]
+  "(TARGET_64BIT
+    && INTVAL (operands[2]) + INTVAL (operands[4]) == 32
+    && SMALL_OPERAND (INTVAL (operands[5]) >> INTVAL (operands[4])))"
+  [(set (match_dup 6) (and:DI (match_dup 1) (match_dup 5)))
+   (set (match_dup 0) (sign_extend:DI (ashift:SI (match_dup 7) (match_dup 4))))]
+{
+  HOST_WIDE_INT new_mask = INTVAL (operands[5]) >> INTVAL (operands[4]);
+  operands[5] = GEN_INT (new_mask);
+  operands[7] = gen_lowpart (SImode, operands[6]);
+})
 
 ;; Standard extensions and pattern for optimization
 (include "bitmanip.md")
