[Bug tree-optimization/101280] [12 Regression] TSVC s231 slower with -Ofast -march=znver1 since r12-1836-g0ad9d88a3d7170b3

2021-10-13 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101280

--- Comment #10 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:5d2771db571323bc7ea25c61b5ed9e5309950f18

commit r9-9772-g5d2771db571323bc7ea25c61b5ed9e5309950f18
Author: Richard Biener 
Date:   Wed Jun 23 09:59:28 2021 +0200

tree-optimization/101173 - fix interchange dependence checking

This adjusts the loop interchange dependence checking to properly
guard all dependence checks with DDR_REVERSED_P or its inverse.

2021-07-07  Richard Biener  

PR tree-optimization/101173
PR tree-optimization/101280
* gimple-loop-interchange.cc
(tree_loop_interchange::valid_data_dependences): Properly
guard all dependence checks with DDR_REVERSED_P or its
inverse.

* gcc.dg/torture/pr101173.c: New testcase.

(cherry picked from commit e46ec6e243c704f0858d16af380a7d9c36fc4244)

[Bug tree-optimization/101280] [12 Regression] TSVC s231 slower with -Ofast -march=znver1 since r12-1836-g0ad9d88a3d7170b3

2021-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101280

--- Comment #9 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:ac6efdd70779a3be748d11c3b03c08df9ce15dd7

commit r10-10097-gac6efdd70779a3be748d11c3b03c08df9ce15dd7
Author: Richard Biener 
Date:   Wed Jun 23 09:59:28 2021 +0200

tree-optimization/101173 - fix interchange dependence checking

This adjusts the loop interchange dependence checking to properly
guard all dependence checks with DDR_REVERSED_P or its inverse.

2021-07-07  Richard Biener  

PR tree-optimization/101173
PR tree-optimization/101280
* gimple-loop-interchange.cc
(tree_loop_interchange::valid_data_dependences): Properly
guard all dependence checks with DDR_REVERSED_P or its
inverse.

* gcc.dg/torture/pr101173.c: New testcase.

(cherry picked from commit e46ec6e243c704f0858d16af380a7d9c36fc4244)

[Bug tree-optimization/101280] [12 Regression] TSVC s231 slower with -Ofast -march=znver1 since r12-1836-g0ad9d88a3d7170b3

2021-07-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101280

--- Comment #8 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:e46ec6e243c704f0858d16af380a7d9c36fc4244

commit r11-8699-ge46ec6e243c704f0858d16af380a7d9c36fc4244
Author: Richard Biener 
Date:   Wed Jun 23 09:59:28 2021 +0200

tree-optimization/101173 - fix interchange dependence checking

This adjusts the loop interchange dependence checking to properly
guard all dependence checks with DDR_REVERSED_P or its inverse.

2021-07-07  Richard Biener  

PR tree-optimization/101173
PR tree-optimization/101280
* gimple-loop-interchange.cc
(tree_loop_interchange::valid_data_dependences): Properly
guard all dependence checks with DDR_REVERSED_P or its
inverse.

* gcc.dg/torture/pr101173.c: New testcase.

[Bug tree-optimization/101280] [12 Regression] TSVC s231 slower with -Ofast -march=znver1 since r12-1836-g0ad9d88a3d7170b3

2021-07-02 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101280

--- Comment #7 from CVS Commits  ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:c4804ff24401733e3b470a49b8a6c9306e6cfcfa

commit r12-1973-gc4804ff24401733e3b470a49b8a6c9306e6cfcfa
Author: Richard Biener 
Date:   Fri Jul 2 08:51:43 2021 +0200

tree-optimization/101280 - re-revise interchange fix for PR101173

The following fixes up the revision of the original fix for PR101173
to properly guard all dependence checks with DDR_REVERSED_P or its
inverse.

2021-07-01  Richard Biener  

PR tree-optimization/101280
PR tree-optimization/101173
* gimple-loop-interchange.cc
(tree_loop_interchange::valid_data_dependences): Properly
guard all dependence checks with DDR_REVERSED_P or its
inverse.

[Bug tree-optimization/101280] [12 Regression] TSVC s231 slower with -Ofast -march=znver1 since r12-1836-g0ad9d88a3d7170b3

2021-07-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101280

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Richard Biener  ---
Should be fixed.

[Bug tree-optimization/101280] [12 Regression] TSVC s231 slower with -Ofast -march=znver1 since r12-1836-g0ad9d88a3d7170b3

2021-07-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101280

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:0a77c07b9b3fe83679358c3ef57721e09e2ad5fb

commit r12-1954-g0a77c07b9b3fe83679358c3ef57721e09e2ad5fb
Author: Richard Biener 
Date:   Thu Jul 1 12:49:45 2021 +0200

tree-optimization/101280 - revise interchange fix for PR101173

The following revises the original fix for PR101173 to correctly
check for a reversed dependence rather than disallowing a zero
distance.  It also adds a check from TSVC which asks for this
kind of interchange (but with a valid dependence).

2021-07-01  Richard Biener  

PR tree-optimization/101280
PR tree-optimization/101173
* gimple-loop-interchange.cc
(tree_loop_interchange::valid_data_dependences): Revert
previous change and instead correctly handle DDR_REVERSED_P
dependence.

* gcc.dg/tree-ssa/loop-interchange-16.c: New testcase.
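
For readers unfamiliar with the underlying legality rule, here is a minimal
textbook sketch of it, not GCC's implementation. It assumes each dependence is
summarized by a distance vector with one entry per loop level, outermost first,
and that interchanging two loops is acceptable when the vector stays
lexicographically non-negative after swapping the two entries. The names
`lex_nonnegative`, `interchange_keeps_dependence` and `MAX_DEPTH` are
hypothetical, and the sketch does not model DDR_REVERSED_P or any other GCC
internals.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 8  /* hypothetical limit on loop-nest depth for this sketch */

/* True if the first nonzero entry of v[0..n) is positive, or all are zero. */
static bool lex_nonnegative(const int *v, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        if (v[k] > 0)
            return true;
        if (v[k] < 0)
            return false;
    }
    return true;
}

/* Hypothetical helper: does swapping loop levels a and b leave the
   distance vector dist[0..n) lexicographically non-negative?  */
bool interchange_keeps_dependence(const int *dist, size_t n,
                                  size_t a, size_t b)
{
    int swapped[MAX_DEPTH];

    if (n > MAX_DEPTH || a >= n || b >= n)
        return false;  /* be conservative on input we cannot handle */

    for (size_t k = 0; k < n; k++)
        swapped[k] = dist[k];
    swapped[a] = dist[b];
    swapped[b] = dist[a];

    return lex_nonnegative(swapped, n);
}
```

Under these assumptions a zero entry does not block interchange by itself: a
distance vector (0, 1) becomes (1, 0) and is accepted, while (1, -1) becomes
(-1, 1) and is rejected.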

[Bug tree-optimization/101280] [12 Regression] TSVC s231 slower with -Ofast -march=znver1 since r12-1836-g0ad9d88a3d7170b3

2021-07-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101280

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                           |ASSIGNED

--- Comment #4 from Richard Biener  ---
Testing fix.

[Bug tree-optimization/101280] [12 Regression] TSVC s231 slower with -Ofast -march=znver1 since r12-1836-g0ad9d88a3d7170b3

2021-07-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101280

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-07-01

--- Comment #3 from Richard Biener  ---
void dummy (double *, double *);
#define LEN_2D 32
double aa[LEN_2D][LEN_2D], bb[LEN_2D][LEN_2D];
double s231(int iterations)
{
//loop interchange
//loop with data dependency
    for (int nl = 0; nl < 100*(iterations/LEN_2D); nl++) {
        for (int i = 0; i < LEN_2D; ++i) {
            for (int j = 1; j < LEN_2D; j++) {
                aa[j][i] = aa[j - 1][i] + bb[j][i];
            }
        }
        dummy(aa[0],bb[0]);
    }
}

compiles and

> gcc-11 t.c -O3 -fopt-info-loop -S
t.c:9:27: optimized: loops interchanged in loop nest
t.c:10:31: optimized: loop vectorized using 16 byte vectors
t.c:4:8: optimized: loop with 15 iterations completely unrolled (header execution count 33608120)

while trunk only vectorizes.
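
As an illustration of why interchange is valid for this kernel, the following
sketch runs the inner two loops in both orders and compares the results. Each
aa[j][i] depends only on aa[j - 1][i] in the same column, so in (i, j) loop
order the dependence distance is (0, 1), and after swapping the loops it is
(1, 0), which still runs forward. The function names and the initial array
values are made up for this example; it is a hand-written check, not the
transformation GCC performs.

```c
#include <string.h>

#define LEN_2D 32

/* Original s231 loop order: i outer, j inner (strided walk down a column). */
static void s231_original(double aa[LEN_2D][LEN_2D],
                          double bb[LEN_2D][LEN_2D])
{
    for (int i = 0; i < LEN_2D; ++i)
        for (int j = 1; j < LEN_2D; j++)
            aa[j][i] = aa[j - 1][i] + bb[j][i];
}

/* Hand-interchanged order: j outer, i inner (unit-stride walk along a row). */
static void s231_interchanged(double aa[LEN_2D][LEN_2D],
                              double bb[LEN_2D][LEN_2D])
{
    for (int j = 1; j < LEN_2D; j++)
        for (int i = 0; i < LEN_2D; ++i)
            aa[j][i] = aa[j - 1][i] + bb[j][i];
}

/* Returns 1 when both loop orders leave identical bytes in the array. */
int s231_orders_agree(void)
{
    static double a1[LEN_2D][LEN_2D], a2[LEN_2D][LEN_2D];
    static double b[LEN_2D][LEN_2D];

    /* Arbitrary but deterministic input values (an assumption of this sketch). */
    for (int j = 0; j < LEN_2D; j++)
        for (int i = 0; i < LEN_2D; i++) {
            a1[j][i] = a2[j][i] = (double)(j * LEN_2D + i) * 0.5;
            b[j][i] = (double)(i - j) * 0.25;
        }

    s231_original(a1, b);
    s231_interchanged(a2, b);
    return memcmp(a1, a2, sizeof a1) == 0;
}
```

Both orders perform the same chain of additions for every column, so
s231_orders_agree() is expected to report a match; the interchanged form
simply makes the innermost accesses contiguous.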

[Bug tree-optimization/101280] [12 Regression] TSVC s231 slower with -Ofast -march=znver1 since r12-1836-g0ad9d88a3d7170b3

2021-07-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101280

--- Comment #2 from Richard Biener  ---
Seems to be this one:

real_t s231(struct args_t * func_args)
{
//loop interchange
//loop with data dependency

    initialise_arrays(__func__);
    gettimeofday(&func_args->t1, NULL);

    for (int nl = 0; nl < 100*(iterations/LEN_2D); nl++) {
        for (int i = 0; i < LEN_2D; ++i) {
            for (int j = 1; j < LEN_2D; j++) {
                aa[j][i] = aa[j - 1][i] + bb[j][i];
            }
        }
        dummy(a, b, c, d, e, aa, bb, cc, 0.);
    }

    gettimeofday(&func_args->t2, NULL);
    return calc_checksum(__func__);
}

[Bug tree-optimization/101280] [12 Regression] TSVC s231 slower with -Ofast -march=znver1 since r12-1836-g0ad9d88a3d7170b3

2021-07-01 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101280

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |12.0
             Blocks|                            |101173

--- Comment #1 from Richard Biener  ---
Can you paste the loop kernel?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101173
[Bug 101173] [9/10/11 Regression] wrong code at -O3 on x86_64-linux-gnu