On 12/19/2025 11:15 AM, Vineet Gupta wrote:
              Before            After
       ---------------------+----------------------
         bge a0,zero,.L2    | slti      a0,a0,0
                            | czero.eqz a0,a3,a0
         xor a1,a1,a3       | xor       a0,a0,a1
       .L2:                 |
         mv  a0,a1          |
         ret                | ret

This is what all the prev NFC patches have been preparing to get to.

Currently the cond arith code only handles EQ/NE zero conditions, missing
the ifcvt optimization for cases such as GE zero, as shown in the example
above.
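
For reference, here is a minimal C sketch of the kind of source that would
yield the "Before" sequence above. The function names and the unused third
parameter are hypothetical, chosen only so the operands land in a0, a1 and
a3 under the standard RISC-V calling convention; this is not the contents
of pr122769.c.

```c
/* Hypothetical example, not the actual testcase: XOR is applied only
   when the first argument is negative, i.e. it is skipped for a >= 0.  */
long
cond_xor_branchy (long a, long b, long unused, long d)
{
  (void) unused;
  if (a < 0)
    b ^= d;
  return b;
}

/* The same computation without a branch: select either d or 0, then
   XOR unconditionally.  This is the shape if-conversion aims for.  */
long
cond_xor_branchless (long a, long b, long unused, long d)
{
  (void) unused;
  long sel = (a < 0) ? d : 0;
  return b ^ sel;
}
```

The branchless form relies on XOR with zero being a no-op, which is why a
conditional-zero instruction is enough to implement the select.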

The actual change is to switch from noce_emit_czero () to noce_emit_cmove (),
which can handle conditions other than EQ/NE and, if needed, generate
additional supporting insns such as SLT for those conditions.
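
A rough C model of why a non-EQ/NE condition costs one extra instruction:
Zicond's czero.eqz only tests a register against zero, so a signed "less
than zero" condition first has to be materialized as 0/1 by an SLT-style
compare. The helper names below are hypothetical and model instruction
semantics only; they are not GCC internals.

```c
/* Model of Zicond "czero.eqz rd, rs1, rs2": rd = (rs2 == 0) ? 0 : rs1.  */
static long
czero_eqz (long rs1, long rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* Model of "slti rd, rs1, imm": rd = (rs1 < imm) ? 1 : 0, signed.  */
static long
slti (long rs1, long imm)
{
  return rs1 < imm ? 1 : 0;
}

/* "if (a < 0) b ^= d; return b;" as three straight-line steps.  */
long
cond_xor_seq (long a, long b, long d)
{
  long cond = slti (a, 0);         /* 1 when a < 0, else 0 */
  long sel = czero_eqz (d, cond);  /* d when cond != 0, else 0 */
  return b ^ sel;
}
```

With an EQ/NE-zero condition the compare step disappears, since the tested
register can feed the conditional-zero instruction directly.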

This also allows us to remove the check at entry that limited the transform
to EQ/NE conditions, improving ifcvt outcomes in general.

gcc/ChangeLog:

        * ifcvt.cc (noce_try_cond_zero_arith): Use noce_emit_cmove.
        (noce_emit_czero): Delete, no longer used.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/pr122769.c: New test.

Co-authored-by: Philipp Tomsich <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
P.S. Jeff had added an old patch from Vrull as a reference to the PR, which
fixed the test. I used the central piece from that patch as a ref for
this fix, hence the co-authored tag.

Makes sense.  One of the key things we realized with this code is that when it uses gen_rtx_* and similar constructs directly, it's probably a mistake.  Instead it should route through either an expander or something like noce_emit_cmove, which can handle canonicalization issues.


That concept also applies to better handling of extensions and subregs, but IMHO general improvement of sub-word accesses is probably a gcc-17 thing.  It was the single biggest source of missed optimizations we saw when doing some post-processing analysis of QEMU data.  Essentially we had code to analyze QEMU block data to identify SFB-like sequences as a proxy for missed conditional moves.  It wasn't perfect, but it clearly showed that once the basics work well (such as canonicalization of the condition), the biggest gap is those pesky sub-word cases.


OK for the trunk.


Jeff
