[Bug target/81550] [8 regression] gcc.target/powerpc/loop_align.c fails starting with r250482

2018-01-29 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81550

--- Comment #11 from Michael Meissner  ---
Author: meissner
Date: Mon Jan 29 22:30:34 2018
New Revision: 257166

URL: https://gcc.gnu.org/viewcvs?rev=257166&root=gcc&view=rev
Log:
2018-01-29  Michael Meissner  

PR target/81550
* config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): If DFmode
and SFmode can go in Altivec registers (-mcpu=power7 for DFmode,
-mcpu=power8 for SFmode) don't set the PRE_INCDEC or PRE_MODIFY
flags.  This restores the settings used before the 2017-07-24 change.
Turning off pre increment/decrement/modify allows IVOPTS to
optimize DF/SF loops where the index is an int.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.c
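
As a rough illustration of the rule described in the log above, here is a minimal sketch in C.  It is not the code from rs6000.c: the enum, the flag values, the struct and the function name are all hypothetical stand-ins, and the only thing modelled is "if the scalar FP mode may live in Altivec registers, advertise no update-form addressing".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; these are NOT GCC's real types or values. */
enum scalar_mode { MODE_SF, MODE_DF };

enum {
  ADDR_PRE_INCDEC = 1 << 0,  /* pre-increment / pre-decrement addressing */
  ADDR_PRE_MODIFY = 1 << 1   /* pre-modify addressing */
};

struct cpu_features {
  bool df_in_altivec;  /* assumed true for -mcpu=power7 and later */
  bool sf_in_altivec;  /* assumed true for -mcpu=power8 and later */
};

/* Return the update-form addressing flags allowed for a scalar FP mode. */
unsigned fp_update_addr_mask(enum scalar_mode mode, struct cpu_features cpu)
{
  assert(mode == MODE_SF || mode == MODE_DF);

  bool in_altivec = (mode == MODE_DF) ? cpu.df_in_altivec : cpu.sf_in_altivec;

  /* The loads/stores used for Altivec registers have no update forms, so
     if the value may be allocated there, advertise neither flag.  */
  if (in_altivec)
    return 0;

  return ADDR_PRE_INCDEC | ADDR_PRE_MODIFY;
}
```

With this model, a power7-like configuration (DFmode in Altivec registers, SFmode not) would clear the flags for DFmode only, while a power8-like configuration would clear them for both modes.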

2018-01-24 Thread meissner at gcc dot gnu.org

Michael Meissner  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Michael Meissner  ---
Fixed in subversion id 257038.

2018-01-24 Thread meissner at gcc dot gnu.org

--- Comment #9 from Michael Meissner  ---
Author: meissner
Date: Thu Jan 25 01:09:19 2018
New Revision: 257038

URL: https://gcc.gnu.org/viewcvs?rev=257038&root=gcc&view=rev
Log:
[gcc/testsuite]
2018-01-24  Michael Meissner  

PR target/81550
* gcc.target/powerpc/loop_align.c: Use unsigned long for the loop
index instead of int, which allows IVOPTs to properly optimize the
loop.


Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/powerpc/loop_align.c
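
To make the testsuite change concrete, here is a sketch of the two loop shapes.  It assumes the test loop is a simple element-wise double add, which is what the assembly quoted in Comment #6 suggests; the function names are hypothetical and the real test file may differ in detail.

```c
#include <assert.h>

/* Shape before the change: signed int index and bound.  */
void add_int_index(double *a, const double *b, const double *c, int n)
{
  int i;
  assert(n <= 0 || (a && b && c));
  for (i = 0; i < n; i++)
    a[i] = b[i] + c[i];
}

/* Shape after the change: unsigned long index and bound, so the index
   already has pointer width and needs no extension inside the loop.  */
void add_ulong_index(double *a, const double *b, const double *c,
                     unsigned long n)
{
  unsigned long i;
  assert(n == 0 || (a && b && c));
  for (i = 0; i < n; i++)
    a[i] = b[i] + c[i];
}
```

Both functions compute the same result; only the type of the induction variable differs, which is the property the log message says matters to IVOPTs.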

2018-01-23 Thread segher at gcc dot gnu.org

--- Comment #8 from Segher Boessenkool  ---
Yes, but that does not work if ivopts decides to make a loop that cannot
work with bdnz ;-)

cc:ing Bin

2018-01-23 Thread meissner at gcc dot gnu.org

--- Comment #7 from Michael Meissner  ---
I think the thing to do is make a shorter loop that won't get extended like the
double loop does.

2018-01-23 Thread meissner at gcc dot gnu.org

--- Comment #6 from Michael Meissner  ---
This is a twisty little passage (all different).

The code is basically trying to test the TARGET_ASM_LOOP_ALIGN_MAX_SKIP target
hook.  It carefully aligns the functions to 16 bytes and then wants the normal
loop alignment to be 32 bytes.

However, the test for the TARGET_ASM_LOOP_ALIGN_MAX_SKIP target hook (in
rs6000_loop_align_max_skip in rs6000.c) looks at the number of insns generated.

The number of insns generated is now larger, so the test fails.

Here is the code for 250481 on little endian:

f:
cmpwi 7,6,0
blelr 7
addi 6,6,-1
li 9,0
rldicl 6,6,0,32
addi 10,6,1
mtctr 10
.p2align 5,,31
.L3:
lfdx 0,4,9
lfdx 12,5,9
fadd 0,0,12
stfdx 0,3,9
addi 9,9,8
bdnz .L3
blr

Now the code for 250483 (and current trunk) looks like the following on little
endian:

f:
cmpwi 7,6,0
blelr 7
addi 6,6,-1
addi 9,4,-8
rldic 6,6,3,29
addi 5,5,-8
add 4,4,6
addi 3,3,-8
.p2align 4,,15
.L3:
lfdu 0,8(9)
lfdu 12,8(5)
cmpld 7,9,4
fadd 0,0,12
stfdu 0,8(3)
beqlr 7
lfdu 0,8(9)
lfdu 12,8(5)
cmpld 7,9,4
fadd 0,0,12
stfdu 0,8(3)
bne 7,.L3
blr

There are now 12 insns in the loop, compared to 6 in the previous loop.  The
loop alignment is emitted only if the number of insns in the loop is less than
8.
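
The relation between the insn count and the two .p2align directives above can be sketched as follows.  This is a simplified model, not the rs6000.c source: the function names and the default of 2^4 bytes are assumptions, and the 5..8 insn window is taken from Comment #2 rather than from the "less than 8" wording here.

```c
#include <assert.h>

#define DEFAULT_ALIGN_LOG    4  /* assumed default: 2^4 = 16 bytes */
#define SMALL_LOOP_ALIGN_LOG 5  /* 2^5 = 32 bytes */

/* Hypothetical model: small loops (5..8 insns) get 32-byte alignment,
   everything else keeps the default.  Returns log2 of the alignment.  */
int loop_align_log(int ninsns)
{
  assert(ninsns >= 0);
  if (ninsns >= 5 && ninsns <= 8)
    return SMALL_LOOP_ALIGN_LOG;
  return DEFAULT_ALIGN_LOG;
}

/* Maximum number of padding bytes: one less than the alignment.  This is
   the third operand of the .p2align directive.  */
int loop_align_max_skip(int ninsns)
{
  return (1 << loop_align_log(ninsns)) - 1;
}
```

Under these assumptions the 6-insn loop yields (5, 31), matching ".p2align 5,,31", and the 12-insn loop yields (4, 15), matching ".p2align 4,,15".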

Big endian generates somewhat different code:

f:
.quad   .L.f,.TOC.@tocbase,0
.previous
.type   f, @function
.L.f:
cmpwi 7,6,0
blelr 7
addi 6,6,-1
addi 9,4,-8
rldic 6,6,3,29
addi 5,5,-8
add 4,4,6
addi 3,3,-8
.p2align 4,,15
.L3:
addi 9,9,8
addi 5,5,8
lfd 0,0(9)
lfd 12,0(5)
cmpld 7,9,4
addi 3,3,8
fadd 0,0,12
stfd 0,0(3)
beqlr 7
addi 9,9,8
addi 5,5,8
lfd 0,0(9)
lfd 12,0(5)
cmpld 7,9,4
addi 3,3,8
fadd 0,0,12
stfd 0,0(3)
bne 7,.L3

2018-01-23 Thread meissner at gcc dot gnu.org

Michael Meissner  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |meissner at gcc dot gnu.org

2018-01-23 Thread segher at gcc dot gnu.org

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #5 from Segher Boessenkool  ---
The root cause here is we no longer use bdnz.  The loop2_doloop dump
file says:

Doloop: Possible infinite iteration case.
Doloop: The loop is not suitable.

(In the source code, the loop cannot be infinite; in the RTL it
probably can, but I did not check).

2018-01-23 Thread meissner at gcc dot gnu.org

--- Comment #4 from Michael Meissner  ---
I must have typed the wrong numbers, as revision 250482 is indeed the revision
that breaks it.

2018-01-23 Thread meissner at gcc dot gnu.org

--- Comment #3 from Michael Meissner  ---
It isn't actually subversion id 250482 that causes the problem.  I've built
250481 and 250483 compilers and there is no difference in code.  I had 252844,
and it shows the problem.

The difference between the two is that 250481 generates code that allows PRE_INC,
PRE_DEC, and PRE_MODIFY on DFmode values for power7.

Now, in theory it should not allow PRE_* on DFmode, since power7 supports
DFmode in both traditional FPR registers and Altivec registers.  The
traditional FPR loads and stores support PRE_* forms of the instruction, but the
VSX loads used for the Altivec registers don't support PRE_* (the original form
of the instructions supported it, but it was removed before the hardware shipped).

The debug code (-mdebug=reg) shows that in theory the PRE_* support was turned
off, but somewhere it is getting turned back on and the lfdu is generated.

If you compile the code with -mno-update, it generates the same code.

2018-01-10 Thread jakub at gcc dot gnu.org

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
The patch changed what ivopts does on the loop, previous:
  # ivtmp.13_22 = PHI 
  _4 = MEM[base: b_13(D), index: ivtmp.13_22, offset: 0B];
  _6 = MEM[base: c_14(D), index: ivtmp.13_22, offset: 0B];
  _8 = _4 + _6;
  MEM[base: a_15(D), index: ivtmp.13_22, offset: 0B] = _8;
  ivtmp.13_26 = ivtmp.13_22 + 8;
  if (ivtmp.13_26 != _18)
is now:
  # ivtmp.4_22 = PHI 
  # ivtmp.6_23 = PHI 
  # ivtmp.8_9 = PHI 
  ivtmp.4_26 = ivtmp.4_22 + 8;
  _31 = (void *) ivtmp.4_26;
  _4 = MEM[base: _31, offset: 0B];
  ivtmp.6_19 = ivtmp.6_23 + 8;
  _30 = (void *) ivtmp.6_19;
  _6 = MEM[base: _30, offset: 0B];
  _8 = _4 + _6;
  ivtmp.8_32 = ivtmp.8_9 + 8;
  _29 = (void *) ivtmp.8_32;
  MEM[base: _29, offset: 0B] = _8;
  if (ivtmp.4_26 != _39)
so the loop is now longer and in addition to that the bbro pass decides to
duplicate it with conditional return in the middle.  If I add a call to some
function at the end of the function, it doesn't do this anymore, but still the
loop has 9 instructions instead of 6 before and thus is over the
rs6000_loop_align 5..8 insns limit.
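
For readers less used to GIMPLE, the two shapes can be written by hand in C roughly as below.  This is only an illustration with hypothetical function names, not compiler output.  The second function uses post-increment because forming a pointer one element before the start of an array (which the generated code effectively does with "addi 9,4,-8") is not valid portable C.

```c
#include <assert.h>

/* Previous shape: one induction variable, a byte offset shared by all
   three arrays (base + index addressing, as used by lfdx/stfdx).  */
void add_indexed(double *a, const double *b, const double *c,
                 unsigned long n)
{
  unsigned long off, end = n * sizeof(double);
  assert(n == 0 || (a && b && c));
  for (off = 0; off != end; off += sizeof(double))
    *(double *)((char *)a + off) =
        *(const double *)((const char *)b + off) +
        *(const double *)((const char *)c + off);
}

/* New shape: three separate pointer induction variables, each advanced
   every iteration (the form that maps onto update loads/stores).  */
void add_pointer_bump(double *a, const double *b, const double *c,
                      unsigned long n)
{
  const double *b_end = b + n;
  assert(n == 0 || (a && b && c));
  while (b != b_end)
    *a++ = *b++ + *c++;
}
```

Both functions produce the same results; the second simply carries three pointers through the loop instead of one shared offset, which is where the extra induction-variable updates in the new dump come from.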

Mike, so what exactly changed and why don't the lfdx and stfdx
instructions look desirable for ivopts?

2018-01-10 Thread rguenth at gcc dot gnu.org

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1

2017-12-19 Thread aldyh at gcc dot gnu.org

Aldy Hernandez  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-12-19
                 CC|                            |aldyh at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Aldy Hernandez  ---
Confirmed.

2017-07-26 Thread rguenth at gcc dot gnu.org

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |8.0