Re: [PATCH 4/5] ifcvt: cond zero arith: elide short forward branch for signed GE 0 comparison [PR122769]

Jeffrey Law Sun, 21 Dec 2025 08:11:23 -0800


On 12/19/2025 11:15 AM, Vineet Gupta wrote:

              Before            After
       ---------------------+----------------------
         bge a0,zero,.L2    | slti      a0,a0,0
                            | czero.eqz a0,a0,a0
         xor a1,a1,a3       | xor       a0,a0,a0
       .L2                  |
         mv  a0,a1          |
         ret                | ret

This is what all the previous NFC patches have been building up to.

Currently the cond arith code only handles EQ/NE zero conditions, missing
ifcvt optimization for cases such as GE zero, as shown in the example above.

The actual change is to switch from noce_emit_czero () to noce_emit_cmove (),
which can handle conditions other than EQ/NE and, if needed, generate
additional supporting insns such as SLT for those conditions.

This also allows us to remove the constraint at the entry to limit to EQ/NE
conditions, improving ifcvt outcomes in general.
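
For readers following along, here is a rough C model of the two shapes in
the table above. It is only a sketch: the function names are made up for
illustration, the operand roles (a0 = a, a1 = b, a3 = c) are inferred from
the "Before" column, and czero_eqz () is a plain C stand-in for the Zicond
instruction, not anything taken from the patch or the new test.

```c
#include <assert.h>
#include <stdint.h>

/* Branchy shape: roughly what "bge a0,zero,.L2; xor a1,a1,a3" computes.
   The xor is skipped whenever a >= 0.  */
int64_t xor_if_negative_branchy (int64_t a, int64_t b, int64_t c)
{
  if (a < 0)
    b ^= c;
  return b;
}

/* Stand-in for Zicond czero.eqz rd,rs1,rs2: rd = (rs2 == 0) ? 0 : rs1.  */
static int64_t czero_eqz (int64_t rs1, int64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* Branchless shape: slti, then czero.eqz, then an unconditional xor.
   XOR with zero is a no-op, so zeroing c when a >= 0 has the same
   effect as skipping the xor.  */
int64_t xor_if_negative_branchless (int64_t a, int64_t b, int64_t c)
{
  int64_t is_neg = a < 0;                 /* slti: 1 if a < 0, else 0 */
  int64_t masked = czero_eqz (c, is_neg); /* c if a < 0, else 0 */
  return b ^ masked;
}

/* Hypothetical helper: spot-check that both shapes agree.  */
void check_equivalence (int64_t a, int64_t b, int64_t c)
{
  assert (xor_if_negative_branchy (a, b, c)
          == xor_if_negative_branchless (a, b, c));
}
```

The point of the sketch is that the GE 0 condition only needs one extra
compare (the SLT) in front of the conditional zero, which is the kind of
supporting insn noce_emit_cmove () can generate.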

gcc/ChangeLog:

        * ifcvt.c (noce_try_cond_zero_arith): Use noce_emit_cmove.
        Delete noce_emit_czero () no longer used.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/pr122769.c: New test.

Co-authored-by: Philipp Tomsich <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
P.S. Jeff had added an old patch from Vrull as a reference to PR, which
fixed the test. I used the central piece from that patch as a ref for
this fix, hence the co-authored tag.

Makes sense. One of the key things we realized with this code is when it's using gen_rtx_* and similar constructs it's probably a mistake. Instead it should be routing through either an expander or something like noce_emit_cmove which can handle canonicalization issues.

That concept also applies to better handling of extensions and subregs, but IMHO general improvement of sub-word accesses is probably a gcc-17 thing. It was the single biggest source of missed optimizations we saw when doing some post-processing analysis of QEMU data. Essentially we had code to analyze QEMU block data to identify SFB-like sequences as a proxy for missed conditional moves. It wasn't perfect, but it clearly showed that once the basics work well (such as canonicalization of the condition), the biggest gap is those pesky sub-word cases.



OK for the trunk.


Jeff
