https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64081
--- Comment #53 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Created attachment 40690
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40690&action=edit
reduced testcase with rtl dumps and assembly

Ughh, that was painful.

The attached .tar.gz file has a reduced testcase.ii as well as the RTL dumps
and testcase.s. The testcase looks big, but it only generates one small
function: crapola.

I hope I didn't botch something in the reduction. The order of the loop is
somewhat changed, but I think it still exhibits the behavior. That is:

[snip]
        addi 7,5,-1             ;; r7 = length - 1
        lwz 3,4(9)
        addi 8,8,8
        mtctr 7                 ;; init CTR to length - 1
        li 10,1                 ;; j
        li 4,1
        .align 4
L..3:
        lwzu 9,4(8)
        lwz 9,4(9)
        andis. 7,9,0x4000
        beq 0,L..12
        addi 10,10,1            ;; keep r10/j updated even though...
        bdnz L..3               ;; ...we iterate with CTR
        blr
        .align 4
L..12:
        lwz 6,0(3)
        lwz 7,0(6)
        cmpwi 7,7,0
        bne- 7,L..13
        rlwinm 9,10,29,3,29
        rlwinm 7,10,0,27,31
        add 9,6,9
        addi 10,10,1
        lwz 6,12(9)
        cmplw 7,5,10            ;; compare length and j
        slw 7,4,7
        or 7,6,7
        stw 7,12(9)
        bne+ 7,L..3             ;; branch on j/length but we fail to update CTR!!
        blr
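
To make the effect of the missing CTR update easier to see, here is a rough Python model of the control flow in the listing above. It is only a sketch under assumptions: the name count_iterations, the slow_js set (which values of j take the L..12 path) and the sync_ctr flag are all made up for illustration, the loads and stores are left out, the L..13 exit is ignored, and length is assumed to be at least 2. The sync_ctr option just shows what keeping the two counters in step would look like in this model; it is not meant to describe the actual fix.

```python
def count_iterations(length, slow_js, sync_ctr=False):
    """Count trips through L..3 in a simplified model of the listing.

    length   -- the loop bound held in r5 (assumed >= 2)
    slow_js  -- hypothetical set of j values that take the L..12 path
    sync_ctr -- if True, also decrement CTR on the L..12 path
                (illustration only, not the real fix)
    """
    ctr = length - 1              # addi 7,5,-1 ; mtctr 7
    j = 1                         # li 10,1
    iterations = 0
    while True:
        iterations += 1           # one pass through the body at L..3
        if j in slow_js:
            # L..12: the exit test compares j against length
            j += 1                # addi 10,10,1
            if sync_ctr:
                ctr -= 1          # hypothetical: keep CTR in step with j
            if j == length:       # cmplw 7,5,10 ; bne+ 7,L..3
                return iterations
            # branch back to L..3 with CTR left untouched
        else:
            # fall-through path: the exit test uses CTR
            j += 1                # addi 10,10,1
            ctr -= 1              # bdnz decrements CTR...
            if ctr == 0:          # ...and falls through to blr at zero
                return iterations


if __name__ == "__main__":
    # length = 4, so the intended trip count is length - 1 = 3
    print(count_iterations(4, set()))                # 3
    print(count_iterations(4, {1}))                  # 4: one trip too many
    print(count_iterations(4, {1}, sync_ctr=True))   # 3
```

In this model every trip through L..12 leaves CTR one too high, so if the loop later exits through the bdnz path it runs that many extra iterations with j already past length.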