Re: [Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
On Thu, 25 Mar 2010, Sebastian Pop wrote: On Wed, Mar 24, 2010 at 16:35, howarth at nitro dot med dot uc dot edu gcc-bugzi...@gcc.gnu.org wrote: Fixed. Please use ftp://gcc.gnu.org/pub/gcc/infrastructure/cloog-ppl-0.15.9.tar.gz Shouldn't the required cloog-ppl version in configure be bumped from 0.15.5 to 0.15.9? Richi what do you think? It's a bit late for that change. You could warn at configure time if a broken but acceptable version is detected, like we do for gmp or mpfr (I don't remember which one). Thanks, Richard.
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #33 from rguenther at suse dot de 2010-03-26 09:58 --- Subject: Re: [4.5 Regression][graphite]-fgraphite-identity miscompiles air.f90 On Thu, 25 Mar 2010, Sebastian Pop wrote: On Wed, Mar 24, 2010 at 16:35, howarth at nitro dot med dot uc dot edu gcc-bugzi...@gcc.gnu.org wrote: Fixed. Please use ftp://gcc.gnu.org/pub/gcc/infrastructure/cloog-ppl-0.15.9.tar.gz Shouldn't the required cloog-ppl version in configure be bumped from 0.15.5 to 0.15.9? Richi what do you think? It's a bit late for that change. You could warn at configure time if a broken but acceptable version is detected, like we do for gmp or mpfr (I don't remember which one). Thanks, Richard. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
Re: [Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
On Wed, Mar 24, 2010 at 16:35, howarth at nitro dot med dot uc dot edu gcc-bugzi...@gcc.gnu.org wrote: Fixed. Please use ftp://gcc.gnu.org/pub/gcc/infrastructure/cloog-ppl-0.15.9.tar.gz Shouldn't the required cloog-ppl version in configure be bumped from 0.15.5 to 0.15.9? Richi what do you think?
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #32 from sebpop at gmail dot com 2010-03-25 17:43 --- Subject: Re: [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90 On Wed, Mar 24, 2010 at 16:35, howarth at nitro dot med dot uc dot edu gcc-bugzi...@gcc.gnu.org wrote: Fixed. Please use ftp://gcc.gnu.org/pub/gcc/infrastructure/cloog-ppl-0.15.9.tar.gz Shouldn't the required cloog-ppl version in configure be bumped from 0.15.5 to 0.15.9? Richi what do you think? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #30 from spop at gcc dot gnu dot org 2010-03-24 20:17 --- Fixed. Please use ftp://gcc.gnu.org/pub/gcc/infrastructure/cloog-ppl-0.15.9.tar.gz -- spop at gcc dot gnu dot org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #31 from howarth at nitro dot med dot uc dot edu 2010-03-24 21:35 --- (In reply to comment #30) Fixed. Please use ftp://gcc.gnu.org/pub/gcc/infrastructure/cloog-ppl-0.15.9.tar.gz Shouldn't the required cloog-ppl version in configure be bumped from 0.15.5 to 0.15.9? #if CLOOG_VERSION_MAJOR != 0 || CLOOG_VERSION_MINOR != 15 || CLOOG_VERSION_REVISION 5 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #27 from spop at gcc dot gnu dot org 2010-03-23 20:08 --- I just verified that the problem is well in CLooG-PPL. With CLooG-ISL we generate a correct code that looks like this: for (c2=1;c2=D.1554_10-3;c2++) { S1(c2); for (c4=0;c4=c2-1;c4++) { S2(c2,c4); S3(c2,c4); S4(c2,c4); } S2(c2,c2); S4(c2,c2); for (c4=c2+1;c4=D.1554_10-2;c4++) { S2(c2,c4); S3(c2,c4); S4(c2,c4); } S5(c2); S6(c2); } -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #28 from spop at gcc dot gnu dot org 2010-03-23 21:24 --- I fixed this problem in CLooG-PPL: the code generated with the new version looks like the code generated by CLooG-ISL: for (c2=1;c2=D.1554_10-3;c2++) { S1(c2); for (c4=0;c4=c2-1;c4++) { S2(c2,c4); S3(c2,c4); S4(c2,c4); } S2(c2,c2); S4(c2,c2); for (c4=c2+1;c4=D.1554_10-2;c4++) { S2(c2,c4); S3(c2,c4); S4(c2,c4); } S5(c2); S6(c2); } I will bootstrap and test the graphite branch with the modified version of CLooG-PPL and then if everything looks good, I will prepare a new version CLooG-PPL-0.15.9 and I will upload it on ftp://gcc.gnu.org/pub/gcc/infrastructure/ Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #29 from spop at gcc dot gnu dot org 2010-03-23 21:27 --- Also note that with the fix in CLooG-PPL, gfortran -O2 -fgraphite-identity air.f90 -o air ./air passes without error. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
Re: [Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
On Sat, Mar 20, 2010 at 05:45, dominiq at lps dot ens dot fr wrote: Do you understand why graphite does not detect that the loop for [scat_1+1, T_10-2] depends on the one for [0, scat_1-1]? Graphite does know this, but it does not ask CLooG to generate [0, scat_1-1] after [scat_1+1, T_10-2], however CLooG does generate it, so I am thinking that this is a problem in CLooG. Second question why does graphite exchange the order of the split loops? CLooG does that.
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #26 from sebpop at gmail dot com 2010-03-21 16:28 --- Subject: Re: [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90 On Sat, Mar 20, 2010 at 05:45, dominiq at lps dot ens dot fr wrote: Do you understand why graphite does not detect that the loop for [scat_1+1, T_10-2] depends on the one for [0, scat_1-1]? Graphite does know this, but it does not ask CLooG to generate [0, scat_1-1] after [scat_1+1, T_10-2], however CLooG does generate it, so I am thinking that this is a problem in CLooG. Second question why does graphite exchange the order of the split loops? CLooG does that. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #25 from dominiq at lps dot ens dot fr 2010-03-20 10:45 --- We are executing the range [scat_1+1, T_10-2] before executing the range [0, scat_1-1]. Do you understand why graphite does not detect that the loop for [scat_1+1, T_10-2] depends on the one for [0, scat_1-1]? Second question why does graphite exchange the order of the split loops? Note also that in the original loop handling the loop for [0, scat_1-1] is tricky when scat_1==1 (fctr = fctr1*fctr2 and fctr2 = -o*fctr2 must be handled correctly). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #24 from dominiq at lps dot ens dot fr 2010-03-14 11:40 --- We are executing the range [scat_1+1, T_10-2] before executing the range [0, scat_1-1]. This would be OK if fctr had been initialized with the right value. Failing to do so leads to an early NAN. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #22 from dominiq at lps dot ens dot fr 2010-03-09 13:32 --- The following variant of spectop.f90 is miscompiled with '-fgraphite-identity -O3': SUBROUTINE SPECTOP(Dr,N) IMPLICIT REAL*8(A-H,o-Z) DIMENSION d1(0:32,0:32) , Dr(0:32,0:32) , x(0:32) REAL*8 Dr ! ! PROGRAM TO COMPUTE THE CHEBYSHEV SPECTRAL OPERATOR ! ang = DBLE(1) s = DBLE(6) o = DBLE(1) t = DBLE(2) pi = t*DASIN(ang) DO i = 0 , N x(i) = DCOS(pi*DBLE(i)/DBLE(N)) ENDDO ! ! IF J=K ! DO j = 1 , N - 1 d1(j,j) = -x(j)/(t*(o-x(j)**2)) ENDDO d1(0,0) = (t*DBLE(N)**2+o)/s d1(N,N) = -d1(0,0) ! ! IF J.NE.K ! fctr = -t DO j = 1 , N-1 d1(0,j) = fctr/(x(0)-x(j)) fctr = -fctr ENDDO d1(0,N) = fctr/(t*(x(0)-x(N))) fctr1 = -1.0D0 DO k = 1 , N-1 d1(k,0) = fctr1/(t*(x(k)-x(0))) fctr = -fctr1 DO j = 1 , N-1 if (j.ne.k) d1(k,j) = fctr/(x(k)-x(j)) fctr = -fctr ENDDO d1(k,N) = fctr/(t*(x(k)-x(N))) fctr1 = -fctr1 ENDDO d1(N,0) = fctr1/(x(N)-x(0)) fctr = -t*fctr1 DO j = 1 , N-1 d1(N,j) = fctr/(x(N)-x(j)) fctr = -fctr ENDDO DO k = 0 , N DO j = 0 , N Dr(k,j) = d1(N-k,N-j) ENDDO ENDDO CONTINUE END but is correctly compiled if DO j = 1 , N-1 if (j.ne.k) d1(k,j) = fctr/(x(k)-x(j)) fctr = -fctr ENDDO is replaced with DO j = 1 , k-1 d1(k,j) = fctr/(x(k)-x(j)) fctr = -fctr ENDDO fctr = -fctr DO j = k+1 , N-1 d1(k,j) = fctr/(x(k)-x(j)) fctr = -fctr ENDDO -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #23 from spop at gcc dot gnu dot org 2010-03-09 22:31 --- Thanks for reducing this testcase. On the further reduced kernel: SUBROUTINE SPECTOP(Dr,N) IMPLICIT REAL*8(A-H,o-Z) DIMENSION d1(0:32,0:32) , Dr(0:32,0:32) , x(0:32) REAL*8 Dr DO k = 1 , N-1 d1(k,0) = fctr1/(t*(x(k)-x(0))) fctr = -fctr1 DO j = 1 , N-1 if (j.ne.k) d1(k,j) = fctr/(x(k)-x(j)) fctr = -fctr ENDDO d1(k,N) = fctr/(t*(x(k)-x(N))) fctr1 = -fctr1 ENDDO Dr(k,j) = d1(N-k,N-j) CONTINUE END We have this CLAST generated by CLooG: if (T_10 = 3) { S4(0) ; S5(0,0) ; S7(0,0) ; for (scat_3=1;scat_3=T_10-2;scat_3++) { S5(0,scat_3) ; S6(0,scat_3) ; S7(0,scat_3) ; } S9(0) ; S14(0) ; } for (scat_1=1;scat_1=T_10-3;scat_1++) { S4(scat_1) ; S5(scat_1,scat_1) ; S7(scat_1,scat_1) ; for (scat_3=scat_1+1;scat_3=T_10-2;scat_3++) { S5(scat_1,scat_3) ; S6(scat_1,scat_3) ; S7(scat_1,scat_3) ; } for (scat_3=0;scat_3=scat_1-1;scat_3++) { S5(scat_1,scat_3) ; S6(scat_1,scat_3) ; S7(scat_1,scat_3) ; } S9(scat_1) ; S14(scat_1) ; } if (T_10 = 3) { S4(T_10-2) ; S5(T_10-2,T_10-2) ; S7(T_10-2,T_10-2) ; for (scat_3=0;scat_3=T_10-3;scat_3++) { S5(T_10-2,scat_3) ; S6(T_10-2,scat_3) ; S7(T_10-2,scat_3) ; } S9(T_10-2) ; S14(T_10-2) ; } if (T_10 == 2) { S4(0) ; S5(0,0) ; S7(0,0) ; S9(0) ; S14(0) ; } Where T_10 stands for the parameter N, S4 stands for d1(k,0) = fctr1/(t*(x(k)-x(0))) fctr = -fctr1 S5 stands for if (j.ne.k) S6 stands for d1(k,j) = fctr/(x(k)-x(j)) S7 stands for fctr = -fctr. S14 stands for d1(k,N) = fctr/(t*(x(k)-x(N))) fctr1 = -fctr1 So the error seems to be that we are splitting the index scat_3 into two ranges that are not ordered in the same way as in the original loop nest: for (scat_1=1;scat_1=T_10-3;scat_1++) { S4(scat_1) ; S5(scat_1,scat_1) ; S7(scat_1,scat_1) ; for (scat_3=scat_1+1;scat_3=T_10-2;scat_3++) { S5(scat_1,scat_3) ; S6(scat_1,scat_3) ; S7(scat_1,scat_3) ; } for (scat_3=0;scat_3=scat_1-1;scat_3++) { S5(scat_1,scat_3) ; S6(scat_1,scat_3) ; S7(scat_1,scat_3) ; } S9(scat_1) ; S14(scat_1) ; } We are executing the range [scat_1+1, T_10-2] before executing the range [0, scat_1-1]. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #14 from howarth at nitro dot med dot uc dot edu 2010-03-08 23:09 --- Is this issue to be fixed for gcc 4.5? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #15 from spop at gcc dot gnu dot org 2010-03-08 23:16 --- Yes. I think it is important to understand what is miscompiled with the graphite identity. I will try to reduce this. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #16 from howarth at nitro dot med dot uc dot edu 2010-03-09 02:26 --- The miscompiled code in air.f90 is the subroutine SPECTOP. If I pull that subroutine out into a separate file and compile it at -O3 without -fgraphite-identity, the remainder of the code can be compiled with -fgraphite-identity at -O3 and the resulting binary runs fine on x86_64-apple-darwin10. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #17 from howarth at nitro dot med dot uc dot edu 2010-03-09 02:34 --- Created an attachment (id=20049) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20049action=view) assembly for spectop subroutine compiled at -O2 on x86_64-apple-darwin10 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #18 from howarth at nitro dot med dot uc dot edu 2010-03-09 02:35 --- Created an attachment (id=20050) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20050action=view) assembly for spectop subroutine compiled at -graphite-identity -O2 on x86_64-apple-darwin10 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #19 from howarth at nitro dot med dot uc dot edu 2010-03-09 02:37 --- Created an attachment (id=20051) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20051action=view) diff between assembly for spectop subroutine at -O2 without and with -fgraphite-identity -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #20 from howarth at nitro dot med dot uc dot edu 2010-03-09 03:45 --- The offending optimization for the spectop subroutine at -O2 -fgraphite-identity appears to be -fstrict-overflow. I can compile... gfortran -fgraphite-identity -O3 -fno-strict-overflow -c spectop.f90 gfortran -O3 -fgraphite-identity air_sans_spectop.f90 spectop.o and the resulting air benchmark runs fine. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #21 from howarth at nitro dot med dot uc dot edu 2010-03-09 03:48 --- Interestingly, I get... gfortran -fgraphite-identity -O3 -Wstrict-overflow=5 -c spectop.f90 spectop.f90: In function spectop: spectop.f90:5:0: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 spectop.f90:5:0: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 spectop.f90:5:0: warning: assuming signed overflow does not occur when reducing constant in comparison spectop.f90:5:0: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 spectop.f90:5:0: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 spectop.f90:5:0: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 spectop.f90:5:0: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 spectop.f90:5:0: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 whereas without -fgraphite-identity, I get... gfortran -O3 -Wstrict-overflow=5 -c spectop.f90 spectop.f90: In function spectop: spectop.f90:23:0: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181