On 10/8/25 12:54 PM, Vineet Gupta wrote:
On 10/8/25 09:47, Shreya Munnangi wrote:
In pr120811, we have cases where GCC is emitting an extra |addi| instruction
instead of using the 12-bit signed-immediate of |ld|.
|addi t1, t1, 1 ld t1, 0(t1) |
This problem occurs when |fp -> sp+offset| elimination results in an
out-of-range constant and we generate an address reload in LRA using
|addsi|/|adddi| expanders.
We've already adjusted the expanders to widen the set of valid operands to
allow more constants for the 2nd input operand. These expanders, rather than
constructing the constant into a register and using an |add| instruction, will
generate two |addi| instructions (or |shNadd|) during initial RTL generation.
We define a new pattern for cases where we need to access the current frame
and the offsets are too large. This gets reasonable code out of LRA in a form
|fold-mem-offsets| can handle, rather than having to wait for |sched2| to do
the height reduction transformation and leaving in the unnecessary |add|
instruction in the RTL stream.
To avoid the two |addi| instructions being squashed back together in the
post-reload combine, we remove the |adddi3_const_sum_of_two_s12| pattern.
RIP |adddi3_const_sum_of_two_s12, you served well :-)|
It definitely served us well for its relatively brief time in the tree.
So it seems -4096 is a special little snowflake.
Prev:
minus1:
addi a0,a0,-2048
addi a0,a0,-2048
ret
W/o Patch:
minus1:
li a5,-4096
add a0,a0,a5
ret
It is. I consider the newer code marginally better on OOO cores.
Essentially the li+add variant has one less data dependency than the
addi+addi variant. So the "li" can execute whenever it's convenient in
an OOO core. In contrast the first addi in the old code has to wait for
its input operand to become available. In an ideal case the li would
bubble up into an unused execution slot and execute "for free".
It probably doesn't matter in any meaningful way.
We are burning an extra register, perhaps something to improve for future ?
TM (4100)
TM (8200)
diff --git a/gcc/testsuite/gcc.target/riscv/pr120811.c
b/gcc/testsuite/gcc.target/riscv/pr120811.c
This causes a collision as the the file already sneaked into upstream from
commit
afdf44154fd "(RISC-V: Only Save/Restore required registers for ILP32E/LP64E)"
Yea, I think that was a mistake of mine. I'll clean that up.
Jeff