https://gcc.gnu.org/g:78247a33643d20778c9df8d04f5c41b534880c88

commit 78247a33643d20778c9df8d04f5c41b534880c88
Author: Jeff Law <[email protected]>
Date:   Thu Nov 6 12:16:10 2025 -0700

    [RISC-V] Ignore useless zero-initialization in conditional move sequence 
costing
    
    Let's consider this code fragment (from xalan):
    
    bool isValidAncestorType(int type) {
        if (type == 0 || type == 6 || type == 4) {
                return true;
        }
        return false;
    }
    
    Right now it generates something like this for RISC-V with Zicond enabled:
    
    >         li      a5,6
    >         bgtu    a0,a5,.L3
    >         li      a5,81
    >         bext    a0,a5,a0
    >         ret
    > .L3:
    >         li      a0,0
    >         ret
    
    But there's a clearly better sequence.  LLVM for example generates:
    
    >         addi    a1, a0, -6
    >         seqz    a1, a1
    >         andi    a0, a0, -5
    >         seqz    a0, a0
    >         or      a0, a1, a0
    
    And with a little work GCC can generate this (which is probably better than
    LLVM's codegen ever-so-slightly):
    
    >         li      a4,6            # 8     [c=4 l=4]  *movdi_64bit/1
    >         li      a5,81           # 33    [c=4 l=4]  *movdi_64bit/1
    >         bext    a5,a5,a0        # 36    [c=4 l=4]  *bextdi
    >         sgtu    a0,a0,a4        # 38    [c=4 l=4]  *sgtu_didi
    >         czero.nez       a0,a5,a0        # 39    [c=4 l=4]  *czero.nez.didi
    
    There's multiple ways Daniel B. and I looked at to resolve this.  At the 
core
    the failure to if-convert is a cost model problem.
    
    We could try to generate more efficient code out of cmove_arith.  What we're
    currently getting out of that routine is horribly bad:
    
    ```
    > (insn 10 3 34 2 (set (reg:DI 138)
    >         (const_int 6 [0x6])) 277 {*movdi_64bit}
    >      (nil))
    > (insn 34 10 35 2 (set (reg:DI 145)
    >         (const_int 0 [0])) 277 {*movdi_64bit}
    >      (nil)) (insn 35 34 36 2 (set (reg:DI 141)
    >         (const_int 81 [0x51])) 277 {*movdi_64bit}
    >      (nil))
    > (insn 36 35 37 2 (set (reg:DI 140 [ _3 ])
    >         (lshiftrt:DI (reg:DI 141)                    (subreg:QI (reg/v:DI 
137 [ type ]) 0))) 301 {lshrdi3}
    >      (nil))
    > (insn 37 36 38 2 (set (reg:DI 143)
    >         (and:DI (reg:DI 140 [ _3 ])
    >             (const_int 1 [0x1]))) 106 {*anddi3}
    >      (nil))
    > (insn 38 37 40 2 (set (reg:DI 146)
    >         (zero_extend:DI (subreg:QI (reg:DI 143) 0))) 126 
{*zero_extendqidi2_internal}
    >      (nil)) (insn 40 38 41 2 (set (reg:DI 147)
    >         (gtu:DI (reg/v:DI 137 [ type ])
    >             (reg:DI 138))) 428 {*sgtu_didi}
    >      (nil))
    > (insn 41 40 42 2 (set (reg:DI 149)         (if_then_else:DI (ne:DI 
(reg:DI 147)
    >                 (const_int 0 [0]))
    >             (const_int 0 [0])
    >             (reg:DI 146))) 40369 {*czero.nez.didi}
    >      (nil))
    > (insn 42 41 43 2 (set (reg:DI 148)
    >         (if_then_else:DI (eq:DI (reg:DI 147)
    >                 (const_int 0 [0]))             (const_int 0 [0])
    >             (reg:DI 145))) 40368 {*czero.eqz.didi}
    >      (nil))
    > (insn 43 42 25 2 (set (reg:DI 136 [ <retval> ])
    >         (plus:DI (reg:DI 148)
    >             (reg:DI 149))) 5 {*adddi3}
    
    ```
    
    In particular note insn 34, 42 and 43.  Those are useless.  Insns 36, 37, 38
    are just a single bit extraction from a variable location (from one of the
    if-converted blocks).  I couldn't see a good way to fix the problem with 
insn
    34/insn 42.  The desire to make the then/else blocks independent is 
cmove_arith
    is good, what's unclear is whether or not that code really cares about the
    *destination* of the then/else blocks. But I set that aside.
    
    We then thought that cleaning up the variable bit extraction would be the 
way
    to go.  So a pattern was constructed to match that form of variable bit 
extract
    and the cost model was twiddled to return that it was a single fast
    instruction.  But even with those changes fwprop1 refused to make the
    substitution.  Sigh.  At least combine recognizes the idiom later and 
cleans it
    up.
    
    Then we realized we really should just ignore the (set (reg) (const_int 0)) 
in
    the if-converted sequence.  We're going to be able to propagate that away in
    nearly every case since we have a hard-wired zero register.  Sure enough,
    ignoring that insn was enough to tip the balance on this case and we get the
    desired code.
    
    Tested on riscv32-elf and riscv64-elf.  Pioneer bootstrap is in flight, 
though
    it won't really exercise this problem.  The BPI's build hasn't started yet, 
so
    it'll be at least 27hours before it's done.
    
    Waiting on pre-commit CI before moving forward.
    
    gcc/
    
            * config/riscv/riscv.cc (riscv_noce_conversion_profitable_p): Ignore
            assignments of (const_int 0) to a register.  They will get 
propagated
            away.
    
    gcc/testsuite
            * gcc.target/riscv/czero-bext.c: New test.
    
    (cherry picked from commit d277d4e3745390a7f3301f6fff19153b89175a0c)

Diff:
---
 gcc/config/riscv/riscv.cc                   |  7 +++++++
 gcc/testsuite/gcc.target/riscv/czero-bext.c | 17 +++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 64a43c0a7c21..72035b618acd 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4722,6 +4722,13 @@ riscv_noce_conversion_profitable_p (rtx_insn *seq,
              if (last_dest)
                last_dest = dest;
            }
+         else if (REG_P (dest) && src == CONST0_RTX (GET_MODE (dest)))
+           {
+             /* A GPR set to zero can always be replaced with x0, so any
+                insn that sets a GPR to zero will eventually be eliminated.  */
+             riscv_if_info.original_cost += COSTS_N_INSNS (1);
+             riscv_if_info.max_seq_cost += COSTS_N_INSNS (1);
+           }
          else
            last_dest = NULL_RTX;
 
diff --git a/gcc/testsuite/gcc.target/riscv/czero-bext.c 
b/gcc/testsuite/gcc.target/riscv/czero-bext.c
new file mode 100644
index 000000000000..54a6d6f175f9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/czero-bext.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gbc_zicond -mabi=lp64d" { target { rv64 } } } 
*/
+/* { dg-options "-O2 -march=rv32gbc_zicond -mabi=ilp32" { target { rv32 } } } 
*/
+
+bool isValidAncestorType(int type) {
+    if (type == 0 || type == 6 || type == 4) {
+            return true;
+    }
+    return false;
+}
+
+
+
+/* { dg-final { scan-assembler "czero.nez\t" } } */
+/* { dg-final { scan-assembler "sgtu\t" } } */
+/* { dg-final { scan-assembler-not "bgtu\t" } } */
+

Reply via email to