[Bug target/29256] [6/7/8 regression] loop performance regression

2018-02-19 wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256

Wilco changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilco at gcc dot gnu.org

--- Comment #66 from Wilco ---
(In reply to Aldy Hernandez from comment #65)
> (In reply to Jeffrey A. Law from comment #45)
> > This problem still exists and can be seen by making the arrays external and
> > using -fno-tree-loop-distribute-patterns.
> 
> Still a problem.  I get the same code Jeff got for comment 45.

A simple workaround for GCC 8 might be to tweak the address costs when loop
unrolling is enabled, so that offsets are preferred over indexing (either in
IVOpt or in the backend). For GCC 9, a tree-level loop optimization has been
proposed that will fix this issue.
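
To make the offset-versus-indexing distinction concrete, here is a rough
source-level sketch of the two addressing shapes for a copy loop unrolled by
four. The function names and the unroll factor are hypothetical, and the
compiler is free to transform either form, so this only illustrates the
shapes being compared. It is not what IVOpt or any backend actually emits.

```c
#include <stddef.h>

/* Hypothetical helper: "indexed" shape. Every access is base + index, so an
   unrolled body needs a separate index value for each element, which is where
   the extra add instructions in the loop come from.
   n is assumed to be a multiple of 4 to keep the sketch short. */
void copy_indexed(double *dst, const double *src, size_t n)
{
    for (size_t i = 0; i < n; i += 4) {
        size_t i1 = i + 1, i2 = i + 2, i3 = i + 3; /* one add per extra index */
        dst[i]  = src[i];
        dst[i1] = src[i1];
        dst[i2] = src[i2];
        dst[i3] = src[i3];
    }
}

/* Hypothetical helper: "offset" shape. The two base pointers are bumped once
   per iteration and each access uses a constant displacement from them.
   n is again assumed to be a multiple of 4. */
void copy_offset(double *dst, const double *src, size_t n)
{
    const double *end = src + n;
    for (; src != end; src += 4, dst += 4) {
        dst[0] = src[0];
        dst[1] = src[1];
        dst[2] = src[2];
        dst[3] = src[3];
    }
}
```

Both functions copy the same data; the second form leaves only the two
pointer updates as per-iteration address arithmetic.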

[Bug target/29256] [6/7/8 regression] loop performance regression

2018-02-01 aldyh at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256

Aldy Hernandez changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2013-12-09 04:50:02         |2018-2-1
                 CC|                            |aldyh at gcc dot gnu.org

--- Comment #65 from Aldy Hernandez ---
(In reply to Jeffrey A. Law from comment #45)
> This problem still exists and can be seen by making the arrays external and
> using -fno-tree-loop-distribute-patterns.
> 
> .L2:
> evlddx 31,10,9
> addi 7,9,8
> addi 0,9,16
> addi 11,9,24
> addi 3,9,32
> evstddx 31,8,9
> addi 4,9,40
> evlddx 31,10,7
> addi 5,9,48
> addi 6,9,56
> evlddx 12,10,6
> addi 9,9,64
> evstddx 31,8,7
> evlddx 7,10,0
> evstddx 7,8,0
> evlddx 0,10,11
> evstddx 0,8,11
> evlddx 11,10,3
> evstddx 11,8,3
> evlddx 3,10,4
> evstddx 3,8,4
> evlddx 4,10,5
> evstddx 4,8,5
> evstddx 12,8,6
> bdnz .L2
> evldd 31,8(1)
> addi 1,1,16
> blr

Still a problem.  I get the same code Jeff got for comment 45.
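
For readers who want to try this, below is a minimal sketch of the kind of
loop being discussed, assuming a plain element-by-element copy between two
external arrays of doubles (the quoted assembly is a run of 64-bit loads and
stores, which fits that shape). The array size and the function name are
hypothetical and are not taken from the bug report. Presumably
-fno-tree-loop-distribute-patterns is needed so that the loop is kept as a
loop rather than being replaced by a library call such as memcpy.

```c
#include <stddef.h>

#define N 1024 /* hypothetical size, chosen only for this sketch */

/* External (non-static) arrays, as described in comment 45. */
extern double a[N];
extern double c[N];

/* Hypothetical name: the simple copy loop that the unroller works on. */
void copy_arrays(void)
{
    for (size_t j = 0; j < N; j++)
        c[j] = a[j];
}

/* In a real reproduction these definitions would live in another translation
   unit; they are defined here only so that the sketch links on its own. */
double a[N];
double c[N];
```

Compiling something like this with unrolling enabled and
-fno-tree-loop-distribute-patterns, then inspecting the generated assembly,
is one way to check whether each unrolled access still gets its own index
register.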