https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98673
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |10.2.1
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359
      Known to work|                            |11.0
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
So with your testcase on trunk I see the following for RISC-V:
        ble     a1,zero,.L2
        li      a6,4
        li      a5,0
        sub     a6,a6,a2
.L5:
        lw      a4,4(a2)
        slli    a7,a5,2
        add     t1,a6,a2
        addi    a5,a5,1
        ble     a4,a0,.L3
        lw      t3,0(a2)
        ble     t3,a0,.L9
.L3:
        addi    a2,a2,4
        bne     a1,a5,.L5
which is fine, and the same for x86. This is usually an SSA coalescing issue
where a failed coalesce ends up splitting the backedge and emitting a move
there. I can see the issue on the branch, where the problematic part is
;; basic block 4, loop depth 1
;;  pred:       3
;;              7
  # i_57 = PHI <0(3), i_41(7)>
...
;; basic block 7, loop depth 1
;;  pred:       4
;;              5
  i_41 = i_57 + 1;
  ivtmp.14_90 = ivtmp.14_91 + 4;
  if (_6 != i_41)
    goto <bb 4>; [94.50%]
  else
    goto <bb 8>; [5.50%]
;;  succ:       4
;;              8
;; basic block 8, loop depth 0
;;  pred:       7
  _87 = (sizetype) i_57;
  _146 = _87 + 2;
which is a use of the pre-increment i_57 on the loop exit edge. Since i_57
is still live after i_41 has been defined, the two live ranges overlap.
This inhibits coalescing of i_57 and i_41, causing the copy.
That's exactly the issue noted in the cited PRs. There have been patches
floating around re-materializing i_41 + 1 at the point of i_57 to make
the coalescing possible but I think nobody developed them in full.
See the thread starting at
https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495843.html
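
To make the shape of the problem concrete, here is a minimal C sketch. It is
not the testcase from this PR; the function names and the loop body are
hypothetical and only mirror the i_57 / i_41 pattern in the dump above. The
first variant reads the pre-increment value after the loop, the second
recomputes it from the post-increment value on the exit path, which is my
reading of what the re-materialization patches aim for. Whether a given GCC
version actually emits the extra move for either variant depends on later
passes, so treat this as an illustration of the live ranges only.

```c
#include <assert.h>
#include <stddef.h>

/* Variant A: the exit path uses the pre-increment value.
   `i` plays the role of i_57 and `i_next` the role of i_41.
   Both are live at the loop exit, so they cannot share a register
   and the backedge needs the copy `i = i_next`. */
size_t exit_uses_pre(int n)
{
    assert(n >= 1);              /* the loop is guarded by n > 0 */
    int i = 0;
    for (;;) {
        int i_next = i + 1;
        if (i_next == n)
            break;               /* loop exit edge */
        i = i_next;              /* copy on the backedge */
    }
    return (size_t)i + 2;        /* use of the pre-increment value */
}

/* Variant B: only the post-increment value survives the loop.
   The pre-increment value is recomputed as i - 1 after the exit,
   so a single variable carries the induction value. */
size_t exit_uses_post(int n)
{
    assert(n >= 1);
    int i = 0;
    do {
        i = i + 1;
    } while (i != n);
    return (size_t)(i - 1) + 2;  /* re-materialized pre-increment value */
}
```

Both variants compute the same result for n >= 1; the difference is only in
which value has to stay live across the exit edge.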