Same functional change as was already posted, this time with a testcase.  Given it's been in my tester and through the pre-commit CI system, I'm going forward now.

--

So as the PR notes, this is an attempt to squeeze out some instructions from a hot part of leela, the random number generator in particular.

typedef unsigned int uint32;


uint32 random(uint32 s1) {
    const uint32 mask = 0xffffffff;
    s1 = (((s1 & 0xFFFFFFFEU) << 12) & mask);
    return s1;
}

Generates this RISC-V code:

        slli    a5,a0,44        # 25    [c=4 l=4]  ashldi3
        srai    a0,a5,44        # 26    [c=4 l=4]  ashrdi3
        andi    a0,a0,-2        # 21    [c=4 l=4]  *anddi3/1
        slli    a0,a0,12        # 22    [c=4 l=4]  ashldi3

But this is an equivalent sequence:

        andi    a0, a0, -2
        slliw   a0, a0, 12

The key is realizing that the first two instructions are just a sign extended bitfield of length 20.  That ultimately gets shifted left 12 bits.  20+12 = 32, so we can at least conceptually use slliw (shift left sign extending result from SI to DI).  The andi just turns off the low bit.
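A minimal sketch of why the two sequences agree, modeling each 64-bit register as a uint64_t.  The helper names (sext, seq_long, seq_short, sequences_agree) are made up for illustration and are not part of the patch; this only emulates the instruction semantics, it is not how the compiler reasons about them.

```c
#include <stdint.h>

/* Sign extend the low `bits` bits of x to 64 bits (hypothetical helper). */
static uint64_t sext(uint64_t x, unsigned bits)
{
    uint64_t sign = 1ULL << (bits - 1);
    x &= (sign << 1) - 1;       /* keep only the low `bits` bits */
    return (x ^ sign) - sign;   /* propagate the field's sign bit upward */
}

/* Models: slli a5,a0,44 ; srai a0,a5,44 ; andi a0,a0,-2 ; slli a0,a0,12 */
static uint64_t seq_long(uint64_t a0)
{
    a0 = sext(a0, 20);          /* slli 44 + srai 44: 20-bit sign extract */
    a0 &= ~1ULL;                /* andi -2: clear the low bit */
    return a0 << 12;            /* slli 12 */
}

/* Models: andi a0,a0,-2 ; slliw a0,a0,12 */
static uint64_t seq_short(uint64_t a0)
{
    a0 &= ~1ULL;                /* andi -2: clear the low bit */
    return sext(a0 << 12, 32);  /* slliw: shift, then sign extend from bit 31 */
}

/* Returns 1 if both sequences agree for every 20-bit low pattern, with the
   upper 44 bits either all clear or all set. */
static int sequences_agree(void)
{
    for (uint64_t low = 0; low < (1ULL << 20); low++) {
        uint64_t junk = low | 0xFFFFFFFFFFF00000ULL;
        if (seq_long(low) != seq_short(low))
            return 0;
        if (seq_long(junk) != seq_short(junk))
            return 0;
    }
    return 1;
}
```

Bit 19 of the input lands in bit 31 after the shift, which is exactly the bit slliw sign extends from, so the explicit slli/srai pair becomes redundant.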

A sign extracted bitfield starting at bit 0 of size N that is then left shifted by M, where N+M == 32, is a natural fit for slliw.  However, when I tried to recognize that and generate the slliw form, I saw code quality regressions that didn't look particularly reasonable to try and fix.  So we want to be more selective about recognizing that idiom: we recognize it only when we subsequently mask off some bits and the mask can be encoded via andi.  This likely could be extended to other logical operations that don't ultimately affect the SI sign bit.
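The selection rule can be sketched as a standalone predicate.  This is only an illustration: can_use_slliw, fits_simm12 and asr64 are hypothetical names, and the example mask of -8192 is an assumption about what the combined pattern carries for the testcase (it is the value that shifts down to the -2 seen in the andi).

```c
#include <stdint.h>

/* Portable arithmetic shift right for 0 <= n < 64 (hypothetical helper). */
static int64_t asr64(int64_t x, unsigned n)
{
    return x < 0 ? ~(~x >> n) : x >> n;
}

/* True if v fits the 12-bit signed immediate that andi accepts. */
static int fits_simm12(int64_t v)
{
    return v >= -2048 && v <= 2047;
}

/* A field of `size` bits starting at bit 0, shifted left by `shift`, then
   masked with `mask`, is a candidate when the field's sign bit lands in
   bit 31 and the mask, moved below the shift, still fits in an andi. */
static int can_use_slliw(int size, int shift, int64_t mask)
{
    if (size + shift != 32)
        return 0;
    return fits_simm12(asr64(mask, (unsigned) shift));
}
```

For the testcase, can_use_slliw (20, 12, -8192) holds because -8192 shifted right by 12 is -2, whereas a mask such as 0x12345000 would be rejected since 0x12345 does not fit in 12 signed bits.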

Jeff
commit 046bc3484c90a25fb09c851d6afac13b790bb20c
Author: Jeff Law <[email protected]>
Date:   Fri May 8 11:40:29 2026 -0600

    [V2][RISC-V][PR target/124955] Utilize slliw for some left shifted signed bitfield extractions
    
    Same functional change as was already posted, this time with a testcase.
    Given it's been in my tester and through the pre-commit CI system, I'm going
    forward now.
    
    --
    
    So as the PR notes, this is an attempt to squeeze out some instructions
    from a hot part of leela, the random number generator in particular.
    
    typedef unsigned int uint32;
    
    uint32 random(uint32 s1) {
        const uint32 mask = 0xffffffff;
        s1 = (((s1 & 0xFFFFFFFEU) << 12) & mask);
        return s1;
    }
    
    Generates this RISC-V code:
    
            slli    a5,a0,44        # 25    [c=4 l=4]  ashldi3
            srai    a0,a5,44        # 26    [c=4 l=4]  ashrdi3
            andi    a0,a0,-2        # 21    [c=4 l=4]  *anddi3/1
            slli    a0,a0,12        # 22    [c=4 l=4]  ashldi3
    
    But this is an equivalent sequence:
    
            andi    a0, a0, -2
            slliw   a0, a0, 12
    
    The key is realizing that the first two instructions are just a sign
    extended bitfield of length 20.  That ultimately gets shifted left 12 bits.
    20+12 = 32, so we can at least conceptually use slliw (shift left sign
    extending result from SI to DI).  The andi just turns off the low bit.
    
    A sign extracted bitfield starting at bit 0 of size N that is then left
    shifted by M, where N+M == 32, is a natural fit for slliw.  However, when I
    tried to recognize that and generate the slliw form, I saw code quality
    regressions that didn't look particularly reasonable to try and fix.  So we
    want to be more selective about recognizing that idiom: we recognize it
    only when we subsequently mask off some bits and the mask can be encoded
    via andi.  This likely could be extended to other logical operations that
    don't ultimately affect the SI sign bit.
    
            PR target/124955
    gcc/
    
            * config/riscv/riscv.md (masked shifted bitfield extraction): New
            splitter to utilize slliw to eliminate the need for sign extension.
    
    gcc/testsuite/
    
            * gcc.target/riscv/pr124955.c: New test

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index b7fba2e88a3..869061e18ae 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -5184,6 +5184,36 @@ (define_split
    (set (match_dup 0) (eq:DI (match_dup 2) (const_int 0)))]
   { operands[1] = gen_lowpart (DImode, operands[1]); })
 
+;; The basic idea is to realize that we can get the sign extension
+;; for free when sign extracting a field and shifting it such that
+;; the sign bit of the field ends up in the SI sign bit.  In that
+;; case it's just a slliw.
+;;
+;; It is tempting to do the extract+shift rewriting independent of
+;; the outer AND.  But that's shown to regress code quality in other
+;; contexts.  So we're being more conservative about trying to
+;; exploit the free sign extension opportunities that show up with
+;; shifted sign extractions.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+       (and:DI
+        (ashift:DI (sign_extract:DI (match_operand:DI 1 "register_operand")
+                                    (match_operand 2 "const_int_operand")
+                                    (match_operand 3 "const_int_operand"))
+                   (match_operand 4 "const_int_operand"))
+        (match_operand:DI 5 "const_int_operand")))
+   (clobber (match_operand:DI 6 "register_operand"))]
+  "(TARGET_64BIT
+    && INTVAL (operands[2]) + INTVAL (operands[4]) == 32
+    && SMALL_OPERAND (INTVAL (operands[5]) >> INTVAL (operands[4])))"
+  [(set (match_dup 6) (and:DI (match_dup 1) (match_dup 5)))
+   (set (match_dup 0) (sign_extend:DI (ashift:SI (match_dup 7) (match_dup 4))))]
+{
+  HOST_WIDE_INT new_mask = INTVAL (operands[5]) >> INTVAL (operands[4]);
+  operands[5] = GEN_INT (new_mask);
+  operands[7] = gen_lowpart (SImode, operands[6]);
+})
+
 (include "bitmanip.md")
 (include "crypto.md")
 (include "sync.md")
diff --git a/gcc/testsuite/gcc.target/riscv/pr124955.c b/gcc/testsuite/gcc.target/riscv/pr124955.c
new file mode 100644
index 00000000000..db6a08b3878
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr124955.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target rv64} } */
+/* { dg-additional-options "-march=rv64gc_zicond -mabi=lp64d" } */
+
+typedef unsigned int uint32;
+
+uint32 random(uint32 s1) {
+    const uint32 mask = 0xffffffff;
+    s1 = (((s1 & 0xFFFFFFFEU) << 12) & mask);
+    return s1;
+}
+
+/* { dg-final { scan-assembler-not "slli\t" } } */
+/* { dg-final { scan-assembler-not "srai\t" } } */
+/* { dg-final { scan-assembler-times "slliw\t" 1 } } */
