[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #28 from bonzini at gnu dot org 2009-02-06 07:33 ---
Subject: Bug 35659

Author: bonzini
Date: Fri Feb 6 07:33:05 2009
New Revision: 143980

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=143980
Log:
2009-02-06  Paolo Bonzini

        PR tree-optimization/35659
        * tree-ssa-sccvn.c (vn_constant_eq, vn_reference_eq, vn_nary_op_eq,
        vn_phi_eq): Shortcut if hashcode does not match.
        (vn_reference_op_compute_hash): Do not call iterative_hash_expr for
        NULL operands.
        * tree-ssa-pre.c (pre_expr_hash): Look at hashcode if available,
        and avoid iterative_hash_expr.
        (FOR_EACH_VALUE_ID_IN_SET): New.
        (value_id_compare): Remove.
        (sorted_array_from_bitmap_set): Use FOR_EACH_VALUE_ID_IN_SET to
        sort expressions by value id.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-pre.c
    trunk/gcc/tree-ssa-sccvn.c

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
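For readers unfamiliar with the "shortcut if hashcode does not match" idea in the log above, here is a minimal sketch in C. It is not the tree-ssa-sccvn.c code: struct toy_entry, toy_hash and toy_entry_eq are hypothetical names invented for illustration, and a string key stands in for the real value-numbering expressions. The only point is that an equality callback can compare the cached hash codes first and skip the expensive structural comparison when they differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hash-table entry that caches its own hash code. */
struct toy_entry {
    unsigned hashcode;   /* computed once, when the entry is created */
    const char *key;     /* stands in for an expensive-to-compare expression */
};

/* Simple multiplicative string hash, used only to fill in hashcode. */
unsigned toy_hash(const char *s)
{
    unsigned h = 5381u;
    while (*s != '\0')
        h = h * 33u + (unsigned char)*s++;
    return h;
}

/* Equality callback: different cached hashes imply different entries,
   so the full comparison runs only when the hashes agree. */
bool toy_entry_eq(const struct toy_entry *a, const struct toy_entry *b)
{
    assert(a != NULL && b != NULL);
    if (a->hashcode != b->hashcode)
        return false;                     /* cheap early exit */
    return strcmp(a->key, b->key) == 0;   /* expensive path */
}
```

The shortcut is safe in this sketch because equal keys always produce equal hash codes, so a hash mismatch can never hide a real match.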
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #27 from mkuvyrkov at gcc dot gnu dot org 2008-08-06 06:38 ---
Should be fixed on both trunk and 4_3-branch.

--
mkuvyrkov at gcc dot gnu dot org changed:

      What|Removed |Added
    Status|ASSIGNED|RESOLVED
Resolution|        |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #26 from mkuvyrkov at gcc dot gnu dot org 2008-08-06 06:36 ---
Subject: Bug 35659

Author: mkuvyrkov
Date: Wed Aug 6 06:34:18 2008
New Revision: 138762

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138762
Log:
Backport from mainline:
2008-08-06  Maxim Kuvyrkov  <[EMAIL PROTECTED]>

        PR target/35659
        * haifa-sched.c (sched_insn_is_legitimate_for_speculation_p): Move ...
        * sched-deps.c (sched_insn_is_legitimate_for_speculation_p): ... here.
        Don't allow predicated instructions for data speculation.
        * sched-int.h (sched_insn_is_legitimate_for_speculation_p): Move
        declaration.

Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/haifa-sched.c
    branches/gcc-4_3-branch/gcc/sched-deps.c
    branches/gcc-4_3-branch/gcc/sched-int.h

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #25 from mkuvyrkov at gcc dot gnu dot org 2008-08-06 06:25 ---
Subject: Bug 35659

Author: mkuvyrkov
Date: Wed Aug 6 06:23:47 2008
New Revision: 138759

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138759
Log:
        PR target/35659
        * haifa-sched.c (sched_insn_is_legitimate_for_speculation_p): Move ...
        * sched-deps.c (sched_insn_is_legitimate_for_speculation_p): ... here.
        Don't allow predicated instructions for data speculation.
        * sched-int.h (sched_insn_is_legitimate_for_speculation_p): Move
        declaration.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/haifa-sched.c
    trunk/gcc/sched-deps.c
    trunk/gcc/sched-int.h

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
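The rule this log entry adds ("Don't allow predicated instructions for data speculation") can be pictured with the sketch below. It is an illustration only, not the sched-deps.c implementation: struct toy_insn, enum toy_spec_kind and toy_legitimate_for_speculation_p are hypothetical names, and a real insn carries far more state than one flag. Allowing predicated insns for control speculation is an assumption of the toy; the log only says what is rejected for data speculation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a scheduler insn. */
struct toy_insn {
    bool is_predicated;   /* guarded by a predicate register, e.g. "(p7) st4" */
};

/* Hypothetical speculation kinds; only the data/control split matters here. */
enum toy_spec_kind { TOY_SPEC_DATA, TOY_SPEC_CONTROL };

/* Return true if INSN may be scheduled speculatively together with the
   speculative load it depends on.  A predicated insn is rejected for data
   speculation: its predicate may have been computed from a stale load, and
   the recovery code cannot undo a store that already went through. */
bool toy_legitimate_for_speculation_p(const struct toy_insn *insn,
                                      enum toy_spec_kind kind)
{
    assert(insn != NULL);
    if (kind == TOY_SPEC_DATA && insn->is_predicated)
        return false;
    return true;
}
```

With such a check, the "(p7) st4 [r77] = r14" store from this bug would stay behind the chk.a.clr instead of being hoisted above it.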
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #24 from sje at cup dot hp dot com 2008-08-04 17:34 --- I bootstrapped the 3 patches on mainline and 4.3 branch and verified that they fix the problem that is reported on the 4.3 branch with the patches. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #23 from mkuvyrkov at gcc dot gnu dot org 2008-08-02 18:15 ---
(In reply to comment #22)
> Maxim, have you had time to look at this bug? Given that it is generating bad
> code and is in 4.3.0 and 4.3.1 I was wondering if it will be fixed for 4.3.2.

Sorry for the delay. I posted the fix at
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00112.html.

I would appreciate if someone could test the cumulative patch of three fixes
for ia64 speculation support or provide a single-file executable testcase for
this bug.

Here are links to two other bugfixes for ia64 speculation support:
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00110.html
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00111.html

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #22 from sje at cup dot hp dot com 2008-08-01 23:28 --- Maxim, have you had time to look at this bug? Given that it is generating bad code and is in 4.3.0 and 4.3.1 I was wondering if it will be fixed for 4.3.2. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #21 from mkuvyrkov at gcc dot gnu dot org 2008-06-26 09:23 ---
Assign to self

--
mkuvyrkov at gcc dot gnu dot org changed:

                 What|Removed                          |Added
           AssignedTo|unassigned at gcc dot gnu dot org|mkuvyrkov at gcc dot gnu dot org
               Status|NEW                              |ASSIGNED
Last reconfirmed date|2008-06-24 15:04:50              |2008-06-26 09:23:37

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #20 from maxim at codesourcery dot com 2008-06-26 09:21 ---
Subject: Re: [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64

jakub at gcc dot gnu dot org wrote:
> --- Comment #19 from jakub at gcc dot gnu dot org 2008-06-26 08:41 ---
> To be more precise, the problem is in speculating conditional store:
>         ld4.a r18 = [r44]
>         st4 [r59] = r14, -60
> ...
>         cmp4.ge p6, p7 = r22, r18
>         (p7) ld4 r14 = [r62]
> ...
>         (p7) st4 [r77] = r14
>         chk.a.clr r18, .L69
> .L70:
> ...
> .L69:
>         ld4 r18 = [r44]
>         ;;
>         cmp4.ge p6, p7 = r22, r18
>         ;;
>         (p7) ld4 r14 = [r62]
>         ;;
>         (p7) st4 [r77] = r14
>         br .L70
>
> Now, ld4.a returns the stale value in r18 (0x20202020), $r22 is 2 and so in
> the code before chk.a.clr p7 is 1, so [r62] is loaded into r14 and stored
> into [r77] (i.e. MR = M in IF (K.GT.M1) MR = M is executed).
> Then chk.a.clr realizes the [r44] memory changed, so branches to .L69 to do
> the speculated stuff again. This time r18 is 1, so p7 will be 0 and MR = M is
> not done. But this really doesn't and can't undo the st4 [r77] = r14 store
> that already happened before chk.a.clr and wasn't supposed to happen.

Oh, now I see the problem. I need some time to check what the best fix would
be. Probably, the fix will be to exclude predicated instructions from
speculation set, but I'd like to consider alternatives.

Thanks,
Maxim

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #19 from jakub at gcc dot gnu dot org 2008-06-26 08:41 ---
To be more precise, the problem is in speculating conditional store:
        ld4.a r18 = [r44]
        st4 [r59] = r14, -60
...
        cmp4.ge p6, p7 = r22, r18
        (p7) ld4 r14 = [r62]
...
        (p7) st4 [r77] = r14
        chk.a.clr r18, .L69
.L70:
...
.L69:
        ld4 r18 = [r44]
        ;;
        cmp4.ge p6, p7 = r22, r18
        ;;
        (p7) ld4 r14 = [r62]
        ;;
        (p7) st4 [r77] = r14
        br .L70

Now, ld4.a returns the stale value in r18 (0x20202020), $r22 is 2 and so in the
code before chk.a.clr p7 is 1, so [r62] is loaded into r14 and stored into
[r77] (i.e. MR = M in IF (K.GT.M1) MR = M is executed).
Then chk.a.clr realizes the [r44] memory changed, so branches to .L69 to do the
speculated stuff again. This time r18 is 1, so p7 will be 0 and MR = M is not
done. But this really doesn't and can't undo the st4 [r77] = r14 store that
already happened before chk.a.clr and wasn't supposed to happen.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
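The sequence described in this comment can be replayed with a small C model. This is a sketch under stated assumptions, not ia64 semantics or GCC code: struct toy_machine, guarded_store and run_block are hypothetical names, the ALAT check is modelled as always failing (the intervening store does hit the loaded address), and the values 0x20202020, M1 = 2 and M = 8 are taken from the reproducer in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the three memory cells involved. */
struct toy_machine {
    int32_t k;    /* slate_.k, addressed through both r44 and r59 */
    int32_t m;    /* M, loaded through r62 */
    int32_t mr;   /* MR, stored through r77 */
};

/* cmp4.ge p6, p7 = r22, r18 ; (p7) ld4 r14 = [r62] ; (p7) st4 [r77] = r14 */
void guarded_store(struct toy_machine *s, int32_t r22, int32_t r18)
{
    bool p7 = !(r22 >= r18);   /* p7 is the complement of "r22 >= r18" */
    if (p7)
        s->mr = s->m;          /* MR = M */
}

/* Run the block and return the final MR.  With speculate == true the load
   of k is hoisted above the store to k, as the scheduler did. */
int32_t run_block(struct toy_machine *s, int32_t r22, bool speculate)
{
    assert(s != NULL);
    if (!speculate) {
        s->k = 1;                     /* st4 [r59] = r14 */
        guarded_store(s, r22, s->k);  /* ld4 r18 = [r44], then the guard */
        return s->mr;
    }
    int32_t r18 = s->k;               /* ld4.a r18 = [r44]: stale value */
    s->k = 1;                         /* st4 [r59] = r14: same address */
    guarded_store(s, r22, r18);       /* predicated store runs too early */
    /* chk.a.clr r18, .L69: the check fails in this model, so the recovery
       code at .L69 re-runs the guarded sequence with the fresh value. */
    guarded_store(s, r22, s->k);
    return s->mr;                     /* the early store was never undone */
}
```

Starting from k = 0x20202020, m = 8, mr = 2 and r22 = 2, the non-speculative order leaves MR at 2, while the speculative order leaves it at 8. That matches the wrong third argument reported for the first TLSMSQ call.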
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #18 from jakub at gcc dot gnu dot org 2008-06-26 08:01 ---
You can reproduce it with a cross compiler too. Just use 4.3 branch (IMHO
unrelated changes on the trunk made this bug latent) and don't combine
everything into the same file. The #c12 program+routines have to be in one
file, #c13 in a different one. The latter is miscompiled. If you are on ia64,
you can compile the former with any optimization options, the latter with -O2,
link, run. If you have just a cross compiler, just compile the #c13 source and
inspect the assembly (look at ld4.a):
        addl r41 = @ltoff(slate_#), r1
...
        adds r14 = 60, r41
...
        .mmi
        mov r44 = r14
        mov r50 = r15
        mov r59 = r14
... ! Instructions that don't modify r44 nor r59, no labels
        .mmi
        ld4.a r18 = [r44] ! *r44 aka slate_.k is uninitialized here, 0x20202020
        st4 [r59] = r14, -60 ! This stores 1 into slate_.k
        adds r16 = 48, r41

Also look at the *.compgotos dump: right before scheduling, the slate_.k = 1
store preceded the prephitmp.78 (== r18) = slate_.k load, while after
scheduling it is the other way around. The alias set looks correct for both
MEMs (4) and MEM_EXPR/MEM_OFFSET etc. too, so IMHO if alias.c was asked if the
two MEMs can overlap, it would certainly say so.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #17 from doko at cs dot tu-berlin dot de 2008-06-25 22:16 --- Subject: Re: [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 mkuvyrkov at gcc dot gnu dot org writes: > Anyway, can you help me reproduce the issue, so I can take a closer look? please email me a ssh key, if access to a Debian machine would help. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #16 from mkuvyrkov at gcc dot gnu dot org 2008-06-25 21:33 ---
I can't reproduce the error with today's mainline. When I put 'PROGRAM
PR35659' and 'SUBROUTINE TLSC (A,B,AUX,IPIV,EPS,X)' in one file and compile it
with any optimization level I get the same "STOP 0" message. Am I doing
something wrong?

I can't spot the problem in the dumps you posted at debian.org. tlsc.s has a
single example of data speculation which seems to be fine.

Scheduler speculatively moves
        ld4.a r18 = [r44]
before
        st4 [r59] = r14, -60
and then also speculates several uses of r18:
        cmp4.ge p6, p7 = r22, r18
...
        (p7) ld4 r14 = [r62]
...
        (p7) st4 [r77] = r14
then it checks the speculation:
        chk.a.clr r18, .L69
and recovers if speculation failed:
.L69:
        .mmi
        ld4 r18 = [r44]
        ;;
        cmp4.ge p6, p7 = r22, r18
        nop 0
        ;;
        .mmi
        nop 0
        (p7) ld4 r14 = [r62]
        nop 0
        ;;
        .mib
        (p7) st4 [r77] = r14
        nop 0
        br .L70

Anyway, can you help me reproduce the issue, so I can take a closer look?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #15 from jakub at gcc dot gnu dot org 2008-06-25 11:32 ---
Wrong-code bug on secondary arch.

--
jakub at gcc dot gnu dot org changed:

    What|Removed |Added
Priority|P3      |P2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #14 from jakub at gcc dot gnu dot org 2008-06-25 11:31 ---
I have no idea why speculation is even attempted here (it doesn't make any
sense, the pointer is surely non-NULL, it points to a global variable), and
apparently nothing checks whether it is safe to move the speculative load over
the store (at least, I've put a breakpoint on nonoverlapping_memrefs_p and
{{,canon_}true,anti,output}_dependence and none of them hit with any MEMs with
r44 or POST_MODIFY r59, -60).

Maxim, speculation is your baby, could you please have a look?

--
jakub at gcc dot gnu dot org changed:

What|Removed |Added
  CC|        |mkuvyrkov at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #13 from jakub at gcc dot gnu dot org 2008-06-25 10:20 ---
And the miscompiled tlsc.f inline (compile with just -O2):

      SUBROUTINE TLSC (A,B,AUX,IPIV,EPS,X)
      COMMON /TLSDIM/ M1,M,N,L,IER
      COMMON /SLATE/ BETA,H,I,IB,IB1,ID,ID1,IEND,II,IST,J,JA,JB,JK
     + ,JST,K,KPIV,KR,KST,KT,K1,LV,MR,M11,NK,NR,PIV,PIVT
     + ,SIG,DUM(11)
      DIMENSION A(*), AUX(*), B(*), IPIV(*), X(*)
      IF (N.GT.M.OR.M1.GT.N) GO TO 90
      K1 = MAX (N,L)
      IER = 1
      DO 5 K=1,N
    5 IPIV(K) = K
      IST = - N
      JB = 1 - L
      M11 = M1 + 1
      MR = M1
      DO 50K=1,N
      IF (K.GT.M1) MR = M
      IST = IST + N + 1
      JB = JB + L
      LV = MR - K + 1
      PIV = 0.
      ID = IST - N
      DO 20J=K,N
      IF (K.EQ.1 .OR. K.EQ.M11)GO TO 10
      PIVT = AUX(J) - A(ID)*A(ID)
      GO TO 15
   10 I = ID + N
      IF (LV .EQ. 1) GO TO 12
      CALL TLSMSQ (A(I),N,LV,PIVT)
      GO TO 15
   12 PIVT= A(I)*A(I)
   15 AUX(J) = PIVT
      ID = ID + 1
      IF (PIVT*EPS.LE.PIV) GO TO 20
      PIV = PIVT
      KPIV = J
   20 CONTINUE
      I = KPIV - K
      IF (I.LE.0) GO TO 25
      H = AUX(K)
      AUX(K) = AUX(KPIV)
      AUX(KPIV) = H
      ID = IST + I
      NR = M - K + 1
      CALL TLSWOP (A(IST),A(ID),N,NR)
   25 CALL TLUK (A(IST),N,LV,SIG,BETA)
      IF (LV.EQ.0) GO TO 90
      J = K1 + K
      AUX(J)=-SIG
      IF (K.GE.N) GO TO 30
      NK = N - K
      IF (LV.EQ.1) GO TO 27
      CALL TLSTEP (A(IST),A(IST+1),N,N,LV,NK,BETA)
      GO TO30
   27 DO 28J=1,NK
      JST = IST + J
   28 A(JST) = A(JST)*(1.-BETA*A(IST)**2)
   30 IB = (K-1) * L + 1
      IF (LV.EQ.1) GO TO 32
      CALL TLSTEP (A(IST),B(IB),N,L,LV,L,BETA)
      GO TO 34
   32 DO 33J=1,L
      JST = IB + J - 1
   33 B(JST) = B(JST)*(1.-BETA*A(IST)**2)
   34 IPIV(KPIV) = IPIV(K)
      IPIV(K) = KPIV
      IF (K.GT.M1) GO TO 50
      DO 45I=M11,M
      ID1 = IST + (I-K)*N
      IF (A(ID1).EQ.0) GO TO 45
      H = - A(ID1)/SIG
      A(ID1) = H
      ID1 = ID1 + 1
      ID = IST + 1
      DO 35J=1,NK
      A(ID1) = A(ID1) - H*A(ID)
      ID1 = ID1 + 1
   35 ID = ID + 1
      IB1 = 1 + (I-1)*L
      IB = JB
      DO 40J=1,L
      B(IB1) = B(IB1) - H*B(IB)
      IB1 = IB1 + 1
   40 IB = IB + 1
   45 CONTINUE
   50 CONTINUE
      IER = N * IER
      KT = N
      JK = (N-1) * L
      K = K1 + N
      PIV = 1./AUX(K)
      DO 55K=1,L
      JK = JK + 1
   55 X(JK) = PIV * B(JK)
      KR = N - 1
      IF (KR.LE.0) GO TO 70
      JST = KR * (N+1) + 2
      DO 65J=1,KR
      JST = JST - N - 1
      IEND= (KR-J+1) * N
      K = K1 + KR - J + 1
      PIV = 1./AUX(K)
      KST = K-K1
      ID = IPIV(KST)-KST
      KST = (KR-J) * L
      DO 65K=1,L
      KST = KST + 1
      H=B(KST)
      II = KST
      DO 60I=JST,IEND
      II = II + L
   60 H = H - A(I) * X(II)
      II = KST + ID *L
      X(KST) = X(II)
      X(II) = PIV * H
   65 CONTINUE
   70 IST = KT*L
      DO 80J=1,L
      IST = IST + 1
      H = 0.
      JA = IST
      IF (M.LE.KT) GO TO 80
      NR = M - KT
      IF (NR.EQ.1) GO TO 75
      CALL TLSMSQ (B(IST),L,NR,H)
      GO TO 80
   75 H = B(IST)*B(IST)
   80 AUX(J) = H
      RETURN
   90 IER = -1001
      RETURN
      END

The problem is that slate.k (aka prephitmp.78) is read before it is stored, so
it has the 0x20202020 value instead of 1. At tlsc.f.198r.compgotos the code
still looks correct:

(insn 1857 1434 1443 8 tlsc.f:16 (set (reg:SI 14 r14 [1392])
        (const_int 1 [0x1])) 4 {*movsi_internal} (expr_list:REG_EQUIV (const_int 1 [0x1])
        (nil)))

(insn 1443 1857 242 8 tlsc.f:16 (set (mem/s/c:SI (post_modify:DI (reg/f:DI 53 r59 [1516])
                (plus:DI (reg/f:DI 53 r59 [1516])
                    (const_int -60 [0xffc4]))) [2 slate.k+0 S4 A32])
        (reg:SI 14 r14 [1392])) 4 {*movsi_internal} (expr_list:REG_INC (reg/f:DI 53 r59 [1516])
        (expr_list:REG_EQUAL (const_int 1 [0x1])
            (nil ...

(insn 197 228 247 8 tlsc.f:17 (set (reg:DI 18 r18)
        (zero_extend:DI (mem/s/c:SI (reg/f:DI 38 r44 [1517]) [2 slate.k+0 S4 A32]))) 103 {zero_extendsidi2} (expr_list:REG_EQUAL (mem/s/c:SI (const:DI (plus:DI (symbol_ref:DI ("slate_") )
                    (const_int 60 [0x3c]))) [2 slate.k+0 S4 A32])
        (nil)))

(insn 247 197 198 8 tlsc.f:22 (set (reg:SI 19 r19 [665])
        (minus:SI (reg:SI 20 r20 [orig:560 D.775 ] [560])
            (reg:SI 32 r38 [orig:529 prephitmp.68 ] [52
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #12 from jakub at gcc dot gnu dot org 2008-06-24 15:04 ---
Even smaller reproducer:

      PROGRAM PR35659
      DIMENSION A(1000), B(1010), AUX(8), IPIV(8), X(16)
      COMMON /TLSDIM/ M1,M,N,L,IER
      COMMON /SLATE/ V1,V2,IAR(24),DUM(14)
      DATA A/0, 1, 0, 0, 1, 0, 2, 0, 1, 0, 0, 0, 1, 0.20003,
     1 0.039991, 0.0080038, 1, 0.40006, 0.15996,
     2 0.06403, 1, 0.60024, 0.36014, 0.21606, 1,
     3 0.80012, 0.63986, 0.51224, 1, 1, 1, 1, 968*0./
      DATA B/1, 2, 1, 1.22140002, 1.49179995, 1.82210004,
     4 2.22550011, 2.7183001, 0, 0, 1000*0./
      M1 = 2
      M = 8
      N = 4
      L = 1
      IER = 0
      V1 = 0
      V2 = 1.40129846e-45
      IAR(:) = 538976288
      DUM(:) = 1.35631564e-19
      CALL TLSC(A,B,AUX,IPIV,1.,X)
      END
      SUBROUTINE TLSMSQ (B,L,M,F)
      DIMENSION B(*)
      IF (M.NE.2) CALL ABORT
      STOP 0
      END
      SUBROUTINE TLSWOP (A,AD,N,NR)
      DIMENSION A(*), AD(*)
      CALL ABORT
      END
      SUBROUTINE TLUK (A,IASEP,NR,SIG,BETA)
      DIMENSION A(*)
      CALL ABORT
      END
      SUBROUTINE TLSTEP (A,B,IASEP,IBSEP,NR,NC,BETA)
      DIMENSION A(*), B(*)
      CALL ABORT
      END

The miscompiled TLSC calls the first TLSMSQ routine with 8.0 rather than 2.0 as
the 3rd argument.

--
jakub at gcc dot gnu dot org changed:

            What|Removed         |Added
          Status|WAITING         |NEW
  Ever Confirmed|0               |1
Last reconfirmed|-00-00 00:00:00 |2008-06-24 15:04:50
            date|                |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #11 from jakub at gcc dot gnu dot org 2008-06-24 14:44 ---
      PROGRAM PR35659
      DIMENSION A(1000), B(1010), AUX(8), IPIV(8), X(16)
      COMMON /TLSDIM/ M1,M,N,L,IER
      COMMON /SLATE/ V1,V2,IAR(24),DUM(14)
      DATA A/0, 1, 0, 0, 1, 0, 2, 0, 1, 0, 0, 0, 1, 0.20003,
     1 0.039991, 0.0080038, 1, 0.40006, 0.15996,
     2 0.06403, 1, 0.60024, 0.36014, 0.21606, 1,
     3 0.80012, 0.63986, 0.51224, 1, 1, 1, 1, 968*0./
      DATA B/1, 2, 1, 1.22140002, 1.49179995, 1.82210004,
     4 2.22550011, 2.7183001, 0, 0, 1000*0./
      M1 = 2
      M = 8
      N = 4
      L = 1
      IER = 0
      V1 = 0
      V2 = 1.40129846e-45
      IAR(:) = 538976288
      DUM(:) = 1.35631564e-19
      CALL TLSC(A,B,AUX,IPIV,1.,X)
      IF (ABS(X(1) - 0.99785352).GE.0.1) CALL ABORT
      IF (ABS(X(2) - 1.0).GE.0.1) CALL ABORT
      IF (ABS(X(3) - 0.50107324).GE.0.1) CALL ABORT
      IF (ABS(X(4) - 0.21670136).GE.0.1) CALL ABORT
      END
      SUBROUTINE TLSMSQ (B,L,M,F)
      COMMON /SLATE/ DUM(38),I,JB
      DIMENSION B(*)
      F = 0.
      JB = 1
      DO 10I=1,M
      F = F + B(JB)*B(JB)
   10 JB = JB + L
      RETURN
      END
      SUBROUTINE TLSWOP (A,AD,N,NR)
      COMMON /SLATE/ DUM(37),H,I,JA
      DIMENSION A(*), AD(*)
      JA = 1
      DO 10I=1,NR
      H = A(JA)
      A(JA) = AD(JA)
      AD(JA) = H
   10 JA = JA + N
      RETURN
      END
      SUBROUTINE TLUK (A,IASEP,NR,SIG,BETA)
      COMMON /SLATE/ DUM(37),I,JA,LL
      DIMENSION A(*)
      SIG= 0.
      JA = 1
      LL = 0
      DO 10I=1,NR
      IF (A(JA).EQ.0.) GO TO 10
      LL = I
      SIG= SIG + A(JA)* A(JA)
   10 JA = JA + IASEP
      NR = LL
      IF (NR.EQ.0) RETURN
      SIG = SIGN (SQRT (SIG),A(1))
      BETA = A(1) + SIG
      A(1) = BETA
      BETA = 1. / (SIG * BETA)
      RETURN
      END
      SUBROUTINE TLSTEP (A,B,IASEP,IBSEP,NR,NC,BETA)
      COMMON /SLATE/ DUM(34),H,I,IB,J,JA,JB
      DIMENSION A(*), B(*)
      IB = 0
      DO 30J=1,NC
      IB = IB + 1
      H = 0.
      JA = 1
      JB = IB
      DO 10I=1,NR
      H = H + A(JA) * B(JB)
      JA = JA +IASEP
   10 JB = JB + IBSEP
      H = H * BETA
      JA = 1
      JB = IB
      DO 20I=1,NR
      B(JB) = B(JB) - A(JA) * H
      JA = JA +IASEP
   20 JB = JB + IBSEP
   30 CONTINUE
      RETURN
      END

together with tlsc.f can work as a testcase, and reproduces the problem with
current 4.3 branch.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #9 from rguenth at gcc dot gnu dot org 2008-06-04 08:59 ---
The only code-generation changing part of the patch is

        (insert_fake_stores): Handle all component-ref style stores
        in addition to INDIRECT_REF. Also handle complex types.

but that only fixes a missed-optimization, so it probably makes the problem
latent on the trunk.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #8 from sje at cup dot hp dot com 2008-06-03 17:59 ---
I looked at this bug and I can reproduce it using the precompiled archives from
the link. I have not tried to get the CERN sources to create a small 'real'
test case. I noticed that the bug does not happen on ToT and found that the
test started working correctly with version 133081. The patch submitted in this
version is for PR 34677, which claims to be a missed optimization issue as
opposed to a codegen bug fix, so I am not sure if the bug is really fixed by
that change or if that change just masks the problem like -funroll-loops does.

2008-03-10  Richard Guenther  <[EMAIL PROTECTED]>

        PR tree-optimization/34677
        * tree-ssa-pre.c (modify_expr_node_pool): Remove.
        (poolify_tree): Likewise.
        (modify_expr_template): Likewise.
        (poolify_modify_stmt): Likewise.
        (insert_fake_stores): Handle all component-ref style stores
        in addition to INDIRECT_REF. Also handle complex types.
        Do not poolify the inserted load.
        (realify_fake_stores): Do not rebuild the tree but only make
        it a SSA_NAME copy.
        (init_pre): Remove initialization of modify_expr_template.
        Do not allocate modify_expr_node_pool.
        (fini_pre): Do not free modify_expr_node_pool.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #7 from mmitchel at gcc dot gnu dot org 2008-05-23 01:29 ---
I agree on both points: (1) I should not have marked this P5, and (2) we do not
yet have a test case. Marked as "WAITING".

--
mmitchel at gcc dot gnu dot org changed:

  What|Removed     |Added
Status|UNCONFIRMED |WAITING

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #6 from rguenth at gcc dot gnu dot org 2008-05-21 10:39 ---
Mark, you made this P5, but ia64-linux is a secondary platform. P3 again to get
it on the radar. But not higher priority because we don't have exactly what I
would call a testcase.

--
rguenth at gcc dot gnu dot org changed:

         What|Removed |Added
           CC|        |mmitchel at gcc dot gnu dot org
Known to fail|        |4.3.0
     Priority|P5      |P3

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #5 from kmccarty at debian dot org 2008-05-21 03:50 ---
(In reply to comment #4)
> I wonder if this is related to PR target/35695, the floating point division
> bug that Jim Wilson fixed. Could you try it with ToT or with the latest 4.3
> branch, both of which have Jim's fix in them.

I tried with the gcc-4_3-branch from Subversion today, but it doesn't change
things: plain -O2 fails but both -O2 -funroll-loops and -O1 succeed on the test
case.

For the record:

(sid)[EMAIL PROTECTED]:~$ ~/gcc-4.3-branch/bin/gfortran -v
Using built-in specs.
Target: ia64-unknown-linux-gnu
Configured with: ./configure --enable-languages=c,fortran --prefix=/home/kmccarty/gcc-4.3-branch --enable-shared --with-mpfr=/home/kmccarty/gcc-4.3-branch
Thread model: posix
gcc version 4.3.1 20080521 (prerelease) (GCC)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #4 from sje at cup dot hp dot com 2008-05-20 21:01 ---
I wonder if this is related to PR target/35695, the floating point division bug
that Jim Wilson fixed. Could you try it with ToT or with the latest 4.3 branch,
both of which have Jim's fix in them.

--
sje at cup dot hp dot com changed:

What|Removed |Added
  CC|        |sje at cup dot hp dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--
mmitchel at gcc dot gnu dot org changed:

    What|Removed |Added
Priority|P3      |P5

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659
[Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
--- Comment #3 from rguenth at gcc dot gnu dot org 2008-03-24 17:32 ---
Possibly a scheduling issue.

--
rguenth at gcc dot gnu dot org changed:

            What|Removed                     |Added
   Known to work|                            |4.2.3
         Summary|Miscompiled code with -O2   |[4.3/4.4 Regression]
                |(but not with -O2 -funroll- |Miscompiled code with -O2
                |loops) on ia64              |(but not with -O2 -funroll-
                |                            |loops) on ia64
Target Milestone|---                         |4.3.1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659