[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

--- Comment #11 from Michael Meissner ---
Author: meissner
Date: Mon Jan 29 22:30:34 2018
New Revision: 257166

URL: https://gcc.gnu.org/viewcvs?rev=257166&root=gcc&view=rev
Log:
2018-01-29  Michael Meissner

        PR target/81550
        * config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): If DFmode
        and SFmode can go in Altivec registers (-mcpu=power7 for DFmode,
        -mcpu=power8 for SFmode) don't set the PRE_INCDEC or PRE_MODIFY
        flags.  This restores the settings used before the 2017-07-24.
        Turning off pre increment/decrement/modify allows IVOPTS to
        optimize DF/SF loops where the index is an int.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.c
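
The rule in this log entry can be pictured as a small mask computation. The sketch below is not the code in rs6000.c: the flag names, their values, and the helper function are hypothetical, and the assumption that indexed addressing is always allowed is made only to give the mask something to contain. Only the rule itself (no PRE_INCDEC or PRE_MODIFY when DFmode/SFmode can go in Altivec registers) comes from the log.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the real masks in rs6000.c have other names
   and values.  */
enum {
  SKETCH_INDEXED    = 1 << 0,  /* reg+reg addressing (assumed always on) */
  SKETCH_PRE_INCDEC = 1 << 1,  /* pre-increment/pre-decrement addressing */
  SKETCH_PRE_MODIFY = 1 << 2   /* pre-modify addressing */
};

/* Return the addressing flags for a scalar floating-point mode.
   CAN_USE_ALTIVEC says whether values of the mode may be allocated to
   Altivec registers (per the log: -mcpu=power7 for DFmode,
   -mcpu=power8 for SFmode).  */
unsigned
sketch_scalar_fp_addr_mask (bool can_use_altivec)
{
  unsigned mask = SKETCH_INDEXED;

  /* Only offer the update forms when the mode is restricted to the
     traditional FPRs.  */
  if (!can_use_altivec)
    mask |= SKETCH_PRE_INCDEC | SKETCH_PRE_MODIFY;

  return mask;
}
```

With the update forms masked out, IVOPTS no longer sees pre-increment addressing as available for these modes, which is what the log says lets it optimize DF/SF loops with an int index.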
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

Michael Meissner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Michael Meissner ---
Fixed in subversion id 257038.
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

--- Comment #9 from Michael Meissner ---
Author: meissner
Date: Thu Jan 25 01:09:19 2018
New Revision: 257038

URL: https://gcc.gnu.org/viewcvs?rev=257038&root=gcc&view=rev
Log:
[gcc/testsuite]
2018-01-24  Michael Meissner

        PR target/81550
        * gcc.target/powerpc/loop_align.c: Use unsigned long for the loop
        index instead of int, which allows IVOPTs to properly optimize the
        loop.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/powerpc/loop_align.c
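
For illustration, here is a loop of the shape that the assembly quoted in this thread suggests (two loads, one add, one store per iteration), written with an unsigned long index as the log describes. This is a reconstruction and not the text of loop_align.c; the function name and signature are assumptions.

```c
/* Sketch only: element-wise sum with an unsigned long loop index.
   The name f_sketch and the parameter list are hypothetical.  */
void
f_sketch (double *a, const double *b, const double *c, unsigned long n)
{
  unsigned long i;

  for (i = 0; i < n; i++)
    a[i] = b[i] + c[i];
}
```

With a 64-bit index there is no narrower value to extend on each iteration, which is presumably what makes the trip count easier for IVOPTs to reason about.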
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

--- Comment #8 from Segher Boessenkool ---
Yes, but that does not work if ivopts decides to make a loop that cannot
work with bdnz ;-)

cc:ing Bin
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

--- Comment #7 from Michael Meissner ---
I think the thing to do is make a shorter loop that won't get extended like
the double loop does.
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

--- Comment #6 from Michael Meissner ---
This is a twisty little passage (all different).

The code is basically trying to test the TARGET_ASM_LOOP_ALIGN_MAX_SKIP target
hook. It carefully aligns the functions to 16 bytes and then wants the normal
loop alignment to be 32 bytes.

However, the test for the TARGET_ASM_LOOP_ALIGN_MAX_SKIP target hook (in
rs6000_loop_align_max_skip in rs6000.c) looks at the number of insns
generated. The loop now contains more insns than before, so the test fails.

Here is the code for 250481 on little endian:

f:
        cmpwi 7,6,0
        blelr 7
        addi 6,6,-1
        li 9,0
        rldicl 6,6,0,32
        addi 10,6,1
        mtctr 10
        .p2align 5,,31
.L3:
        lfdx 0,4,9
        lfdx 12,5,9
        fadd 0,0,12
        stfdx 0,3,9
        addi 9,9,8
        bdnz .L3
        blr

Now the code for 250483 (and current trunk) looks like the following on little
endian:

f:
        cmpwi 7,6,0
        blelr 7
        addi 6,6,-1
        addi 9,4,-8
        rldic 6,6,3,29
        addi 5,5,-8
        add 4,4,6
        addi 3,3,-8
        .p2align 4,,15
.L3:
        lfdu 0,8(9)
        lfdu 12,8(5)
        cmpld 7,9,4
        fadd 0,0,12
        stfdu 0,8(3)
        beqlr 7
        lfdu 0,8(9)
        lfdu 12,8(5)
        cmpld 7,9,4
        fadd 0,0,12
        stfdu 0,8(3)
        bne 7,.L3
        blr

There are now 12 insns in the loop, compared to 6 in the previous loop. The
test is emitting the loop alignment if the # of insns in the loop is less
than 8.

Big endian generates somewhat different code:

f:
        .quad .L.f,.TOC.@tocbase,0
        .previous
        .type f, @function
.L.f:
        cmpwi 7,6,0
        blelr 7
        addi 6,6,-1
        addi 9,4,-8
        rldic 6,6,3,29
        addi 5,5,-8
        add 4,4,6
        addi 3,3,-8
        .p2align 4,,15
.L3:
        addi 9,9,8
        addi 5,5,8
        lfd 0,0(9)
        lfd 12,0(5)
        cmpld 7,9,4
        addi 3,3,8
        fadd 0,0,12
        stfd 0,0(3)
        beqlr 7
        addi 9,9,8
        addi 5,5,8
        lfd 0,0(9)
        lfd 12,0(5)
        cmpld 7,9,4
        addi 3,3,8
        fadd 0,0,12
        stfd 0,0(3)
        bne 7,.L3
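
The dependence on insn count described in this comment can be sketched as a tiny decision function. This is an illustration under stated assumptions, not the rs6000.c implementation: the function name and parameters are hypothetical, and the 5..8 window is taken from comment #2 (this comment only says "less than 8").

```c
/* Hypothetical helper: return log2 of the alignment requested for a
   loop containing NINSNS instructions.  DEFAULT_LOG stands in for the
   normal loop alignment.  */
int
sketch_loop_align_log (int ninsns, int default_log)
{
  /* Assumed window of 5..8 insns, as quoted in comment #2.  */
  if (ninsns >= 5 && ninsns <= 8)
    return 5;           /* 2^5 = 32 bytes, as in ".p2align 5,,31" */

  return default_log;   /* e.g. 4 for 16 bytes, as in ".p2align 4,,15" */
}
```

Under these assumptions the 6-insn loop from 250481 gets the 32-byte alignment, while the 12-insn loop from 250483 falls back to the default, which matches the two `.p2align` directives shown above.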
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

Michael Meissner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |meissner at gcc dot gnu.org
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

Segher Boessenkool changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #5 from Segher Boessenkool ---
The root cause here is we no longer use bdnz. The loop2_doloop dump file
says:

  Doloop: Possible infinite iteration case.
  Doloop: The loop is not suitable.

(In the source code, the loop cannot be infinite; in the RTL it probably
can, I did not check).
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

--- Comment #4 from Michael Meissner ---
I must have typed the wrong numbers, as revision 250482 is indeed the revision
that breaks it.
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

--- Comment #3 from Michael Meissner ---
It isn't actually subversion id 250482 that causes the problem. I've built
250481 and 250483 compilers and there is no difference in code. I had 252844,
and it shows the problem.

The difference between the two is that 250481 generates code that allows
PRE_INC, PRE_DEC, and PRE_MODIFY on DFmode values for power7.

Now, in theory it should not allow PRE_* on DFmode, since power7 supports
DFmode in both traditional FPR registers and Altivec registers. The
traditional FPR loads and stores support PRE_* forms of the instruction, but
the VSX loads used for the Altivec registers don't support PRE_* (the original
form of the instructions supported it, but it was removed before the hardware
shipped).

The debug code (-mdebug=reg) shows that in theory the PRE_* support was turned
off, but somewhere it is getting turned back on and the lfdu is generated.

If you compile the code with -mno-update, it generates the same code.
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek ---
The patch changed what ivopts does on the loop. Previously:

  # ivtmp.13_22 = PHI
  _4 = MEM[base: b_13(D), index: ivtmp.13_22, offset: 0B];
  _6 = MEM[base: c_14(D), index: ivtmp.13_22, offset: 0B];
  _8 = _4 + _6;
  MEM[base: a_15(D), index: ivtmp.13_22, offset: 0B] = _8;
  ivtmp.13_26 = ivtmp.13_22 + 8;
  if (ivtmp.13_26 != _18)

is now:

  # ivtmp.4_22 = PHI
  # ivtmp.6_23 = PHI
  # ivtmp.8_9 = PHI
  ivtmp.4_26 = ivtmp.4_22 + 8;
  _31 = (void *) ivtmp.4_26;
  _4 = MEM[base: _31, offset: 0B];
  ivtmp.6_19 = ivtmp.6_23 + 8;
  _30 = (void *) ivtmp.6_19;
  _6 = MEM[base: _30, offset: 0B];
  _8 = _4 + _6;
  ivtmp.8_32 = ivtmp.8_9 + 8;
  _29 = (void *) ivtmp.8_32;
  MEM[base: _29, offset: 0B] = _8;
  if (ivtmp.4_26 != _39)

so the loop is now longer, and in addition to that the bbro pass decides to
duplicate it with a conditional return in the middle. If I add a call to some
function at the end of the function, it doesn't do this anymore, but the loop
still has 9 instructions instead of 6 before and thus is over the
rs6000_loop_align 5..8 insns limit.

Mike, so what exactly changed, and why don't the lfdx and stfdx instructions
look desirable to ivopts?
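
The two induction-variable choices in these dumps correspond roughly to the two C loop shapes below. This is a sketch for comparison only: the function names are hypothetical, and the second shape uses post-increment because portable C cannot start a pointer one element before an array the way the pre-increment (lfdu/stfdu) code does.

```c
#include <stddef.h>

/* Shape 1: a single byte offset shared by all three arrays, as in the
   "previous" dump (one ivtmp, indexed loads and stores).  */
void
sum_indexed (double *a, const double *b, const double *c, size_t n)
{
  size_t off;
  size_t end = n * sizeof (double);

  for (off = 0; off != end; off += sizeof (double))
    *(double *) ((char *) a + off)
      = *(const double *) ((const char *) b + off)
        + *(const double *) ((const char *) c + off);
}

/* Shape 2: three separately advanced pointers, as in the "now" dump
   (three ivtmps, one per array).  */
void
sum_pointers (double *a, const double *b, const double *c, size_t n)
{
  const double *end = b + n;

  while (b != end)
    *a++ = *b++ + *c++;
}
```

Both functions compute the same result; the difference is that the first keeps one induction variable live in the loop while the second keeps three, which is the extra bookkeeping visible in the second dump.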
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

Aldy Hernandez changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-12-19
                 CC|                            |aldyh at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Aldy Hernandez ---
Confirmed.
[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |8.0