[Bug rtl-optimization/103336] New: [arm64] operations on long double generate calls to libgcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103336

Bug ID: 103336
Summary: [arm64] operations on long double generate calls to libgcc
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: sebpop at gmail dot com
Target Milestone: ---

On testsuite/gcc.target/i386/long-double-64-1.c, gcc produces a call to __multf3 on arm64, whereas the test checks that no such call is generated on x86_64.

$ cat a.c
long double foo (long double x) { return x * x; }

On arm64:
$ gcc -O2 -S -o- a.c
foo:
        stp     x29, x30, [sp, -16]!
        mov     v1.16b, v0.16b
        mov     x29, sp
        bl      __multf3
        ldp     x29, x30, [sp], 16
        ret

On x64:
$ gcc -O2 -S -o- a.c
foo:
        endbr64
        fldt    8(%rsp)
        fmul    %st(0), %st
        ret

Why does GCC not generate an fmul on arm64 instead of calling libgcc?

There is a related performance issue in libGeos: https://github.com/libgeos/geos/issues/509
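
A small probe can show which long double format a given target uses, which is probably what decides between an inline multiply and a library call. This is only a sketch: the helper name long_double_mantissa_bits is made up for illustration, and the reading of the result (64 mantissa bits for the x87 extended format, 113 for IEEE binary128 as used by the AArch64 Linux ABI) is stated as an assumption about common ABIs rather than something taken from the report.

```c
#include <float.h>

/* Hypothetical helper: number of mantissa bits in long double on this target.
   Commonly 64 for x87 extended precision and 113 for IEEE binary128.  */
int long_double_mantissa_bits(void)
{
  return LDBL_MANT_DIG;
}

/* Same function as in the report.  */
long double foo(long double x)
{
  return x * x;
}
```

Printing long_double_mantissa_bits() on both machines, next to the assembly for foo, makes it easy to check whether the library call coincides with the wider format.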
[Bug middle-end/50335] ICE in psct_dynamic_dim, at graphite-poly.h:659
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50335

--- Comment #10 from sebpop at gmail dot com 2011-12-08 04:08:14 UTC ---
On Wed, Dec 7, 2011 at 18:06, maxim_kuvyrkov at mentor dot com gcc-bugzi...@gcc.gnu.org wrote:
> Do I understand correctly that you are suggesting effectively disabling loop flattening completely?

Yes, that would fix this problem and other potential bugs related to this implementation of flattening.
[Bug middle-end/50335] ICE in psct_dynamic_dim, at graphite-poly.h:659
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50335 --- Comment #8 from sebpop at gmail dot com sebpop at gmail dot com 2011-12-06 18:40:43 UTC --- Hi Maxim, Thanks for pointing me to this problem. I would recommend that you disable the code in loop flattening by early returning false in flatten_all_loops. I'm pre-approving you to commit such a patch (as I am not allowed to commit anything to gcc due to my new employer's legal policies, although emailing or giving instructions via a phone call to somebody of what to do seems to be fine for now ;-) Vladimir Kargov will take over where I left the patches to make graphite work with ISL and once these are committed to gcc, we will be able to implement loop flattening with ISL, by calling isl_map_flatten_range. Sebastian
[Bug middle-end/49938] [4.7 regression] ICE in interpret_loop_phi, at tree-scalar-evolution.c:1645
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49938

--- Comment #3 from sebpop at gmail dot com 2011-08-02 15:02:30 UTC ---
On Tue, Aug 2, 2011 at 04:49, rguenth at gcc dot gnu.org gcc-bugzi...@gcc.gnu.org wrote:
> What's the problem with dealing with a POLYNOMIAL_CHREC here? Why not simply return chrec_dont_know instead of asserting?

That's a reasonable fix.
[Bug tree-optimization/48648] internal compiler error: in translate_clast, at graphite-clast-to-gimple.c:1123
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48648

--- Comment #7 from sebpop at gmail dot com 2011-04-18 16:51:23 UTC ---
> (I read somewhere ppl should be replaced with isl?)

Not true: GCC still depends on PPL. The use of cloog-ppl is deprecated: GCC will not use it in the future. Using cloog-isl or cloog-parma is fine.
[Bug tree-optimization/46886] [4.5/4.6 Regression] gfortran.dg/array_constructor_9.f90 FAILs with -ftree-parallelize-loops -fstrict-overflow -fno-tree-ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46886

--- Comment #4 from sebpop at gmail dot com 2011-02-04 20:30:46 UTC ---
On Tue, Jan 18, 2011 at 11:00, jakub at gcc dot gnu.org gcc-bugzi...@gcc.gnu.org wrote:
> Seems one extra incorrect iteration is added after GOMP_parallel_end:

That extra iteration is added by this call:

  /* Ensure that the exit condition is the first statement in the loop.  */
  transform_to_exit_first_loop (loop, reduction_list, nit);

/* Moves the exit condition of LOOP to the beginning of its header, and
   duplicates the part of the last iteration that gets disabled to the
   exit of the loop.  NIT is the number of iterations of the loop
   (used to initialize the variables in the duplicated part).

   TODO: the common case is that latch of the loop is empty and immediately
   follows the loop exit.  In this case, it would be better not to copy the
   body of the loop, but only move the entry of the loop directly before the
   exit check and increase the number of iterations of the loop by one.
   This may need some additional preconditioning in case NIT = ~0.
   REDUCTION_LIST describes the reductions in LOOP.  */

static void
transform_to_exit_first_loop (struct loop *loop, htab_t reduction_list, tree nit)
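
At source level, the effect described in that comment can be pictured roughly as follows. This is a sketch only: GCC performs the transformation on GIMPLE, and the function names and the loop body (a[i] = 2 * i) are invented for illustration. It shows where the duplicated last iteration comes from, which is the place to look when one iteration too many is executed.

```c
#include <stddef.h>

/* Exit test last: the body runs nit + 1 times.  */
void fill_exit_last(int *a, size_t nit)
{
  size_t i = 0;
  do {
    a[i] = (int) (2 * i);
    i++;
  } while (i <= nit);
}

/* Exit test first: the loop runs nit times and the part of the last
   iteration that no longer executes is duplicated after the exit.  */
void fill_exit_first(int *a, size_t nit)
{
  size_t i;
  for (i = 0; i < nit; i++)
    a[i] = (int) (2 * i);
  a[nit] = (int) (2 * nit);   /* duplicated last iteration */
}
```

Both versions write a[0] through a[nit]; if the duplicated statement ran in addition to a loop that already covered index nit, that would be the extra iteration.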
[Bug tree-optimization/46194] [4.5/4.6 Regression] gcc.dg/graphite/block-0.c FAILs with -ftree-parallelize-loops
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46194

--- Comment #7 from sebpop at gmail dot com 2011-02-03 18:11:09 UTC ---
Here is the loop kernel from block-0.c

  for (i = 0; i < N; i++)
    for (j = 0; j < N; j++)
      a[j] = a[i] + 1;

On Fri, Dec 31, 2010 at 06:01, jakub at gcc dot gnu.org gcc-bugzi...@gcc.gnu.org wrote:
> access_fn_A: {0, +, 1}_1
> access_fn_B: {0, +, 1}_2
> (subscript
>   iterations_that_access_an_element_twice_in_A: [0 + 1 * x_1]
>   last_conflict: 1000
>   iterations_that_access_an_element_twice_in_B: [0 + 1 * x_1]

I think that this representation of affine functions is wrong: the access in B should read [0 + 0 * x_1 + 1 * x_2] and that would not lead to a wrong conclusion like the following...

>   last_conflict: 1000
>   (Subscript distance: 0
>   )
> )
> inner loop index: 0
> loop nest: (1 2 )
> distance_vector: 0 0
> direction_vector: = =
> )
[Bug tree-optimization/46194] [4.5/4.6 Regression] gcc.dg/graphite/block-0.c FAILs with -ftree-parallelize-loops
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46194

--- Comment #9 from sebpop at gmail dot com 2011-02-03 19:54:36 UTC ---
> access_fn_A: {0, +, 1}_1
> access_fn_B: {0, +, 1}_2
> (subscript
>   iterations_that_access_an_element_twice_in_A: [0 + 1 * x_1]
>   last_conflict: 1000
>   iterations_that_access_an_element_twice_in_B: [0 + 1 * x_1]
>
> I think that this representation of affine functions is wrong: the access in B should read [0 + 0 * x_1 + 1 * x_2] and that would not lead to a wrong conclusion like the following...

This representation is correct, as it stands for: there exists a dependence for all x_1 from 0 to 1000, that is, there exists a possible overlap that is represented by:

{0, +, 1}_1 ([0 + 1 * x_1]) == {0, +, 1}_2 ([0 + 1 * x_1])

>   last_conflict: 1000
>   (Subscript distance: 0
>   )
> )
> inner loop index: 0
> loop nest: (1 2 )
> distance_vector: 0 0
> direction_vector: = =
> )

This representation is still wrong, and so I went to see where this was computed, and the bug seems to be in build_classic_dist_vector_1

I will submit a patch to fix this.
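
The claim that "distance_vector: 0 0" cannot be the whole story for this kernel can be checked by brute force on a small instance. The sketch below is not how GCC's data dependence analysis works; it simply enumerates pairs of iterations of a[j] = a[i] + 1, and the function name has_nonzero_distance is made up for illustration.

```c
/* Returns 1 if, for an n x n instance of the kernel, the write a[j] at
   iteration (i1, j1) and the read a[i] at iteration (i2, j2) touch the
   same element with a nonzero distance vector (i2 - i1, j2 - j1).  */
int has_nonzero_distance(int n)
{
  for (int i1 = 0; i1 < n; i1++)
    for (int j1 = 0; j1 < n; j1++)
      for (int i2 = 0; i2 < n; i2++)
        for (int j2 = 0; j2 < n; j2++) {
          int same_element = (j1 == i2);   /* write index j1, read index i2 */
          int distance_i = i2 - i1;
          int distance_j = j2 - j1;
          if (same_element && (distance_i != 0 || distance_j != 0))
            return 1;
        }
  return 0;
}
```

For any n of at least 2 this finds conflicting iterations that are not the same iteration, for example the write at (0, 0) and the read at (0, 1), so a lone (0, 0) distance vector under-reports the dependences.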
[Bug middle-end/45306] ICE (Segmentation fault) while building PyQt with -fgraphite-identity
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45306

--- Comment #9 from sebpop at gmail dot com 2011-02-04 07:08:30 UTC ---
On Fri, Feb 4, 2011 at 00:27, dirtyepic at gentoo dot org gcc-bugzi...@gcc.gnu.org wrote:
> I'm guessing that means 4.5 will stay broken?

Depends on how difficult it would be to backport the fix. I haven't git-bisect'ed it to know which of the patches fixed it.
[Bug middle-end/40979] induct benchmark 60% slower when compiled with -fgraphite-identity
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40979 --- Comment #16 from sebpop at gmail dot com sebpop at gmail dot com 2011-02-01 16:59:03 UTC --- It's unfortunate that graphite inserts arrays of size 1 instead of scalar (memory) vars. That could be easily fixed. graphite can also use the original data reference to write the reduction in, and that cannot be replaced by a scalar memory variable.
[Bug middle-end/40979] induct benchmark 60% slower when compiled with -fgraphite-identity
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40979

--- Comment #18 from sebpop at gmail dot com 2011-02-01 17:22:06 UTC ---
On Tue, Feb 1, 2011 at 11:15, rguenth at gcc dot gnu.org gcc-bugzi...@gcc.gnu.org wrote:
> I'd suggest
>
>   NEXT_PASS (pass_graphite);
>     {
>       struct opt_pass **p = &pass_graphite.pass.sub;
>       NEXT_PASS (pass_graphite_transforms);
>       NEXT_PASS (pass_lim);
>       NEXT_PASS (pass_copy_prop);
>       NEXT_PASS (pass_dce_loop);
>     }

That made the loop vectorizable. Thanks Richi!
[Bug bootstrap/45146] Bootstrap broken at -O3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45146

--- Comment #3 from sebpop at gmail dot com 2010-12-22 18:36:31 UTC ---
We bootstrap the graphite branch on amd64-linux for every commit, and that would be the equivalent of your patch. Please open a new bug for tracking this issue.

Thanks,
Sebastian
[Bug tree-optimization/46928] data dependence analysis fails on constant array accesses
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46928

--- Comment #4 from sebpop at gmail dot com 2010-12-13 19:05:58 UTC ---
The code that is produced looks like this just after loop distribution, i.e., we generate memset zero only by distributing the innermost loop:

mad_synth_mute (struct mad_synth * synth)
{
  long unsigned int D.2739;
  long unsigned int D.2740;
  long unsigned int D.2741;
  long unsigned int D.2742;
  long unsigned int D.2743;
  struct mad_synth * D.2744;
  long unsigned int D.2730;
  long unsigned int D.2731;
  long unsigned int D.2732;
  long unsigned int D.2733;
  <unnamed-signed:64> D.2734;
  <unnamed-signed:64> D.2735;
  <unnamed-signed:64> D.2736;
  long unsigned int D.2737;
  struct mad_synth * D.2738;
  long unsigned int D.2721;
  long unsigned int D.2722;
  long unsigned int D.2723;
  long unsigned int D.2724;
  <unnamed-signed:64> D.2725;
  <unnamed-signed:64> D.2726;
  <unnamed-signed:64> D.2727;
  long unsigned int D.2728;
  struct mad_synth * D.2729;
  long unsigned int D.2712;
  long unsigned int D.2713;
  long unsigned int D.2714;
  long unsigned int D.2715;
  <unnamed-signed:64> D.2716;
  <unnamed-signed:64> D.2717;
  <unnamed-signed:64> D.2718;
  long unsigned int D.2719;
  struct mad_synth * D.2720;
  unsigned int pretmp.2;
  unsigned int v;
  unsigned int s;
  unsigned int ch;

<bb 2>:
  goto <bb 10>;

<bb 5>:
Invalid sum of incoming frequencies 139, should be
  s_9 = s_29 + 1;
  if (s_9 != 16)
    goto <bb 6>;
  else
    goto <bb 8>;

<bb 6>:

<bb 7>:
Invalid sum of outgoing probabilities 12.5%
  # s_29 = PHI <0(10), s_9(6)>
  D.2712_25 = (long unsigned int) s_29;
  D.2713_22 = (long unsigned int) ch_28;
  D.2714_1 = D.2713_22 * 64;
  D.2715_2 = D.2712_25 + D.2714_1;
  D.2716_3 = (<unnamed-signed:64>) D.2715_2;
  D.2717_26 = D.2716_3 + 48;
  D.2718_27 = D.2717_26 * 32;
  D.2719_24 = (long unsigned int) D.2718_27;
  D.2720_23 = synth_7(D) + D.2719_24;
  __builtin_memset (D.2720_23, 0, 32);
  D.2721_13 = (long unsigned int) s_29;
  D.2722_12 = (long unsigned int) ch_28;
  D.2723_11 = D.2722_12 * 64;
  D.2724_6 = D.2721_13 + D.2723_11;
  D.2725_20 = (<unnamed-signed:64>) D.2724_6;
  D.2726_32 = D.2725_20 + 32;
  D.2727_33 = D.2726_32 * 32;
  D.2728_34 = (long unsigned int) D.2727_33;
  D.2729_35 = synth_7(D) + D.2728_34;
  __builtin_memset (D.2729_35, 0, 32);
  D.2730_36 = (long unsigned int) s_29;
  D.2731_37 = (long unsigned int) ch_28;
  D.2732_38 = D.2731_37 * 64;
  D.2733_39 = D.2730_36 + D.2732_38;
  D.2734_40 = (<unnamed-signed:64>) D.2733_39;
  D.2735_41 = D.2734_40 + 16;
  D.2736_42 = D.2735_41 * 32;
  D.2737_43 = (long unsigned int) D.2736_42;
  D.2738_44 = synth_7(D) + D.2737_43;
  __builtin_memset (D.2738_44, 0, 32);
  D.2739_45 = (long unsigned int) ch_28;
  D.2740_46 = D.2739_45 * 64;
  D.2741_47 = (long unsigned int) s_29;
  D.2742_48 = D.2740_46 + D.2741_47;
  D.2743_49 = D.2742_48 * 32;
  D.2744_50 = synth_7(D) + D.2743_49;
  __builtin_memset (D.2744_50, 0, 32);
  goto <bb 5>;

<bb 8>:
  ch_10 = ch_28 + 1;
  if (ch_10 != 2)
    goto <bb 9>;
  else
    goto <bb 11>;

<bb 9>:

<bb 10>:
  # ch_28 = PHI <0(2), ch_10(9)>
  goto <bb 7>;

<bb 11>:
  return;

}

and the assembler:

mad_synth_mute:
.LFB0:
        .cfi_startproc
        movq    %rdi, %r9
        xorl    %r8d, %r8d
.L2:
        leaq    16(%r8), %rsi
        movq    %r9, %rdx
        movq    %r8, %rax
        .p2align 4,,10
        .p2align 3
.L3:
        leaq    48(%rax), %rcx
        salq    $5, %rcx
        addq    %rdi, %rcx
        movq    $0, (%rcx)
        movq    $0, 8(%rcx)
        movq    $0, 16(%rcx)
        movq    $0, 24(%rcx)
        leaq    32(%rax), %rcx
        salq    $5, %rcx
        addq    %rdi, %rcx
        movq    $0, (%rcx)
        movq    $0, 8(%rcx)
        movq    $0, 16(%rcx)
        movq    $0, 24(%rcx)
        leaq    16(%rax), %rcx
        addq    $1, %rax
        salq    $5, %rcx
        addq    %rdi, %rcx
        movq    $0, (%rcx)
        movq    $0, 8(%rcx)
        movq    $0, 16(%rcx)
        movq    $0, 24(%rcx)
        movq    $0, (%rdx)
        movq    $0, 8(%rdx)
        movq    $0, 16(%rdx)
        movq    $0, 24(%rdx)
        addq    $32, %rdx
        cmpq    %rsi, %rax
        jne     .L3
        addq    $64, %r8
        addq    $2048, %r9
        cmpq    $128, %r8
        jne     .L2
        rep
        ret
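
For readers who do not have the source at hand, the shape of the loop nest behind this dump can be sketched as below. The struct layout is an assumption inferred from the offsets in the dump (rows of 32 bytes at row indices ch * 64 + {0, 16, 32, 48} + s); the names synth_sketch, mute_with_loops and mute_with_memset are hypothetical and not taken from the report.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout inferred from the dump: rows of 8 four-byte
   values (32 bytes each), indexed [ch][a][b][s].  */
struct synth_sketch {
  int32_t filter[2][2][2][16][8];
};

/* Scalar form: the innermost v loop stores one element at a time.  */
void mute_with_loops(struct synth_sketch *synth)
{
  for (unsigned ch = 0; ch < 2; ch++)
    for (unsigned s = 0; s < 16; s++)
      for (unsigned v = 0; v < 8; v++)
        synth->filter[ch][0][0][s][v] = synth->filter[ch][0][1][s][v] =
        synth->filter[ch][1][0][s][v] = synth->filter[ch][1][1][s][v] = 0;
}

/* After distributing only the innermost loop: four 32-byte memsets
   per (ch, s) pair, in the same order as the offsets 48, 32, 16, 0.  */
void mute_with_memset(struct synth_sketch *synth)
{
  for (unsigned ch = 0; ch < 2; ch++)
    for (unsigned s = 0; s < 16; s++) {
      memset(synth->filter[ch][1][1][s], 0, sizeof synth->filter[ch][1][1][s]);
      memset(synth->filter[ch][1][0][s], 0, sizeof synth->filter[ch][1][0][s]);
      memset(synth->filter[ch][0][1][s], 0, sizeof synth->filter[ch][0][1][s]);
      memset(synth->filter[ch][0][0][s], 0, sizeof synth->filter[ch][0][0][s]);
    }
}

/* Returns 1 when both versions leave byte-identical filters.  */
int mute_versions_agree(void)
{
  static struct synth_sketch a, b;
  memset(&a, 0x5a, sizeof a);
  memset(&b, 0x5a, sizeof b);
  mute_with_loops(&a);
  mute_with_memset(&b);
  return memcmp(&a, &b, sizeof a) == 0 && a.filter[1][1][1][15][7] == 0;
}
```

Under this assumed layout the whole array is contiguous, so distributing the outer loops as well could in principle collapse the nest into a single larger memset; the dump shows that only the innermost loop was distributed.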
[Bug tree-optimization/45314] [4.5/4.6 Regression] ICE: error: in remove_unreachable_handlers, at tree-sh.c:3294 with -O2 -floop-interchange
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45314

--- Comment #7 from sebpop at gmail dot com 2010-11-05 16:51:22 UTC ---
On Fri, Nov 5, 2010 at 11:26, hjl.tools at gmail dot com gcc-bugzi...@gcc.gnu.org wrote:
> On trunk, it was fixed by revision 163123:
>
> http://gcc.gnu.org/ml/gcc-cvs/2010-08/msg00334.html

Thanks HJ for reducing this. I looked at this change and it looks simple enough to backport it to 4.5.
[Bug tree-optimization/45314] [4.5 Regression] ICE: error: in remove_unreachable_handlers, at tree-sh.c:3294 with -O2 -floop-interchange
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45314 --- Comment #10 from sebpop at gmail dot com sebpop at gmail dot com 2010-11-05 18:17:01 UTC --- Here is the backported patch that fixes the ICE. I will further test this and will post to gcc-patches. Sebastian
[Bug tree-optimization/46186] Clang creates code running 1600 times faster than gcc's
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186

--- Comment #23 from sebpop at gmail dot com 2010-10-29 21:44:09 UTC ---
Hi,

here is a preliminary patch (not tested yet other than the PR testcase). This patch improves chrec_apply to also handle these very uncommon cases that some like to make big titles about (I wonder if the guy who submitted this bug report is part of some marketing division... anyways)

Note that for these cases

F (4, sum += a * a)
F (5, sum += a * a * a)
F (6, sum += a * a * a * a * a + 2 * a * a * a + 5 * a)

although GCC with this patch knows how to transform these into end of loop values, GCC won't change them, because of this heuristic:

  /* Do not emit expensive expressions.  The rationale is that
     when someone writes a code like

     while (n > 45) n -= 45;

     he probably knows that n is not large, and does not want it
     to be turned into n %= 45.  */
  || expression_expensive_p (def))

one needs to also set the --param scev-max-expr-size to a pretty big value for f6 to pass the fold steps...

Sebastian
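
To make "end of loop value" concrete, here is what replacing the F (4, sum += a * a) loop by a closed form amounts to. This is an illustration of the idea only, not the output of the patch: the function names are made up, the loop is assumed to run a from 0 to n - 1, and the closed form is the textbook sum-of-squares identity (n - 1) n (2n - 1) / 6.

```c
/* Loop form: accumulate a * a for a = 0 .. n - 1.  */
unsigned long long sum_squares_loop(unsigned long long n)
{
  unsigned long long sum = 0;
  for (unsigned long long a = 0; a < n; a++)
    sum += a * a;
  return sum;
}

/* End-of-loop value computed directly, without iterating.  */
unsigned long long sum_squares_closed(unsigned long long n)
{
  if (n == 0)
    return 0;
  return (n - 1) * n * (2 * n - 1) / 6;
}
```

The closed form needs three multiplications and a division, which is the kind of expression the expression_expensive_p heuristic quoted above is meant to weigh against simply keeping the loop.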
[Bug fortran/44660] [regression 4.4/4.5/4.6] ICE in resolve_equivalence()
--- Comment #6 from sebpop at gmail dot com 2010-06-25 06:07 ---
Subject: Re: [regression 4.4/4.5/4.6] ICE in resolve_equivalence()

These previous patches don't seem to solve the problem: here is another reduced case that still fails in resolve_equivalence at a different place than before.

$ cat bug.f
      CALL TRFWTM(JKT,XX,NX,Y,NIX,NORB2,1,TOL)
      IF(DBUG.AND.NX.GT.0) THEN
      EQUIVALENCE (DBUGME, DBUGME_STR)
      END IF
      END

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44660
[Bug fortran/44660] [regression 4.4/4.5/4.6] ICE in resolve_equivalence()
--- Comment #9 from sebpop at gmail dot com 2010-06-25 06:24 ---
Subject: Re: [regression 4.4/4.5/4.6] ICE in resolve_equivalence()

On Fri, Jun 25, 2010 at 01:14, kargl at gcc dot gnu dot org gcc-bugzi...@gcc.gnu.org wrote:
> ... there is a 200 line difference in the location of your diff and my clean trunk. Do you have other changes in your source code?

Sorry, my patch was against the graphite branch, last merged on 2010-06-07, r160224 from trunk.

Sebastian

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44660
[Bug fortran/44660] [regression 4.4/4.5/4.6] ICE in resolve_equivalence()
--- Comment #4 from sebpop at gmail dot com 2010-06-25 05:32 ---
Subject: Re: [regression 4.4/4.5/4.6] ICE in resolve_equivalence()

On Thu, Jun 24, 2010 at 23:42, kargl at gcc dot gnu dot org gcc-bugzi...@gcc.gnu.org wrote:
> The mangled Fortran code caught my eye. I'm actually wondering where Sebastian found this gem.

I was reducing a graphite ICE in gamess that turned into a fortran front-end ICE... With your fix I will start reducing my ICE again ;-)

Thanks,
Sebastian

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44660
[Bug fortran/44660] [regression 4.4/4.5/4.6] ICE in resolve_equivalence()
--- Comment #5 from sebpop at gmail dot com 2010-06-25 05:49 ---
Subject: Re: [regression 4.4/4.5/4.6] ICE in resolve_equivalence()

On Thu, Jun 24, 2010 at 23:02, kargl at gcc dot gnu dot org gcc-bugzi...@gcc.gnu.org wrote:
> --- Comment #1 from kargl at gcc dot gnu dot org 2010-06-25 04:02 ---
> Index: resolve.c
> ===
> --- resolve.c (revision 161047)
> +++ resolve.c (working copy)
> @@ -12506,6 +12506,9 @@ resolve_equivalence (gfc_equiv *eq)
>    int object, cnt_protected;
>    const char *msg;
>
> +  if (eq->expr->symtree->n.sym == NULL)
> +    return;
> +
>    last_ts = &eq->expr->symtree->n.sym->ts;
>
>    first_sym = eq->expr->symtree->n.sym;

This patch doesn't fix the problem I am seeing. If I'm testing this in the loop before taking the value of e->symtree->n.sym->ts, then it passes without ICE:

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 48bb618..7f66be4 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -12360,6 +12360,9 @@ resolve_equivalence (gfc_equiv *eq)
     {
       e = eq->expr;

+      if (eq->expr->symtree->n.sym == NULL)
+       return;
+
       e->ts = e->symtree->n.sym->ts;
       /* match_varspec might not know yet if it is seeing
         array reference or substring reference, as it doesn't

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44660
[Bug middle-end/43519] [graphite] Bootstrap with Graphite enabled fails in Java libs
--- Comment #3 from sebpop at gmail dot com 2010-04-05 17:30 ---
Subject: Re: [graphite] Bootstrap with Graphite enabled fails in Java libs

On Mon, Apr 5, 2010 at 04:47, rguenth at gcc dot gnu dot org gcc-bugzi...@gcc.gnu.org wrote:
> You shouldn't be using type_for_size but instead use build_nonstandard_integer_type.

I copied this from another LNO pass, should I also update that pass? What about this patch?

Sebastian

--- Comment #4 from sebpop at gmail dot com 2010-04-05 17:30 ---
Created an attachment (id=20313)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20313&action=view)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43519
[Bug middle-end/43519] [graphite] Bootstrap with Graphite enabled fails in Java libs
--- Comment #7 from sebpop at gmail dot com 2010-04-06 05:54 ---
Subject: Re: [graphite] Bootstrap with Graphite enabled fails in Java libs

On Mon, Apr 5, 2010 at 22:16, spop at gcc dot gnu dot org gcc-bugzi...@gcc.gnu.org wrote:
> URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157977
> Log:
> Use build_nonstandard_integer_type.

This commit seems to create problems both in chrec_convert and in the niter estimations: these use unsigned_type_for and signed_type_for that fail by returning NULL_TREE when the type is one that is returned by build_nonstandard_integer_type.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43519
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #32 from sebpop at gmail dot com 2010-03-25 17:43 ---
Subject: Re: [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90

On Wed, Mar 24, 2010 at 16:35, howarth at nitro dot med dot uc dot edu gcc-bugzi...@gcc.gnu.org wrote:
>> Fixed. Please use ftp://gcc.gnu.org/pub/gcc/infrastructure/cloog-ppl-0.15.9.tar.gz
>
> Shouldn't the required cloog-ppl version in configure be bumped from 0.15.5 to 0.15.9?

Richi what do you think?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug middle-end/43464] copy prop breaks loop closed SSA form
--- Comment #5 from sebpop at gmail dot com 2010-03-21 16:08 ---
Subject: Re: copy prop breaks loop closed SSA form

On Sun, Mar 21, 2010 at 04:54, steven at gcc dot gnu dot org gcc-bugzi...@gcc.gnu.org wrote:
> Why such a big hammer?

'cause I don't want to add more bugs, and no I don't think compile time matters.

> You should be able to figure out which copy props are allowed and which should be disallowed in loop-closed SSA form.

patches are welcome.

> Is if (current_loops) the right test here? This will break if Zdenek's patches to keep loops around throughout ever makes it to the trunk.

please propose a better patch.

Thanks,
Sebastian

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43464
[Bug middle-end/42181] [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90
--- Comment #26 from sebpop at gmail dot com 2010-03-21 16:28 ---
Subject: Re: [4.5 Regression][graphite] -fgraphite-identity miscompiles air.f90

On Sat, Mar 20, 2010 at 05:45, dominiq at lps dot ens dot fr wrote:
> Do you understand why graphite does not detect that the loop for [scat_1+1, T_10-2] depends on the one for [0, scat_1-1]?

Graphite does know this, but it does not ask CLooG to generate [0, scat_1-1] after [scat_1+1, T_10-2], however CLooG does generate it, so I am thinking that this is a problem in CLooG.

> Second question why does graphite exchange the order of the split loops?

CLooG does that.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42181
[Bug tree-optimization/43423] gcc should vectorize this loop through iteration range splitting
--- Comment #3 from sebpop at gmail dot com 2010-03-18 18:33 ---
Subject: Re: gcc should vectorize this loop through iteration range splitting

> Well it could be vectorized even without range splitting. The issue is the sinking of the store to a[i].

You mean that the problem is the if-conversion of the stores a[i] = ...

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43423
[Bug tree-optimization/43209] [4.5 Regression] ICE in try_improve_iv_set, at tree-ssa-loop-ivopts.c:5238
--- Comment #6 from sebpop at gmail dot com 2010-03-01 18:10 ---
Subject: Re: [4.5 Regression] ICE in try_improve_iv_set, at tree-ssa-loop-ivopts.c:5238

On Mon, Mar 1, 2010 at 12:02, changpeng dot fang at amd dot com wrote:
> I have a fix for this problem. We should not decrease the cost if the cost is infinite.

Looks good. Thanks for fixing this. Please test with the minor modification below, and submit a patch to gcc-patches@

> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
> index 74dadf7..9accda9 100644
> --- a/gcc/tree-ssa-loop-ivopts.c
> +++ b/gcc/tree-ssa-loop-ivopts.c
> @@ -4124,7 +4124,11 @@ determine_use_iv_cost_condition (struct ivopts_data *data,
>    if (integer_zerop (*bound_cst)
>        && (operand_equal_p (*control_var, cand->var_after, 0)
>           || operand_equal_p (*control_var, cand->var_before, 0)))
> -    elim_cost.cost -= 1;
> +    {
> +      /* Should not decrease the cost if it is infinite */
> +      if (!infinite_cost_p (elim_cost))

You should fuse this condition into the previous condition expression to avoid the inner if.

> +       elim_cost.cost -= 1;
> +    }

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43209
[Bug tree-optimization/43209] [4.5 Regression] ICE in try_improve_iv_set, at tree-ssa-loop-ivopts.c:5238
--- Comment #7 from sebpop at gmail dot com 2010-03-01 18:21 ---
Subject: Re: [4.5 Regression] ICE in try_improve_iv_set, at tree-ssa-loop-ivopts.c:5238

> You should fuse this condition into the previous condition expression to avoid the inner if.

Like this:

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 74dadf7..3b766ed 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -4121,7 +4121,8 @@ determine_use_iv_cost_condition (struct ivopts_data *data,
      TODO: The constant that we're substracting from the cost should
      be target-dependent.  This information should be added to the
      target costs for each backend.  */
-  if (integer_zerop (*bound_cst)
+  if (!infinite_cost_p (elim_cost)
+      && integer_zerop (*bound_cst)
       && (operand_equal_p (*control_var, cand->var_after, 0)
          || operand_equal_p (*control_var, cand->var_before, 0)))
     elim_cost.cost -= 1;

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43209
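
The point of guarding the decrement can be shown in isolation. The sketch below uses simplified, hypothetical stand-ins (sketch_cost, SKETCH_INFTY, sketch_infinite_cost_p) rather than the real ivopts types, and assumes that an infinite cost is represented by a sentinel value that stops being recognised once it is decremented.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ivopts cost type and its sentinel.  */
#define SKETCH_INFTY 10000000

struct sketch_cost { int cost; };

static bool sketch_infinite_cost_p(struct sketch_cost c)
{
  return c.cost == SKETCH_INFTY;
}

/* Apply the "comparison against zero" discount only to finite costs,
   with the check fused into the existing condition.  */
struct sketch_cost discount_zero_compare(struct sketch_cost elim_cost,
                                         bool bound_is_zero, bool var_matches)
{
  if (!sketch_infinite_cost_p(elim_cost) && bound_is_zero && var_matches)
    elim_cost.cost -= 1;
  return elim_cost;
}
```

Without the first test, an infinite cost would become SKETCH_INFTY - 1 and later be treated as an ordinary, merely large, finite cost.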
[Bug tree-optimization/42771] [4.5 Regression][graphite] ICE: in graphite_loop_normal_form, at graphite-sese-to-poly.c (2)
--- Comment #11 from sebpop at gmail dot com 2010-02-11 00:29 ---
Subject: Re: [4.5 Regression][graphite] ICE: in graphite_loop_normal_form, at graphite-sese-to-poly.c (2)

On Wed, Feb 10, 2010 at 12:26, amonakov at gcc dot gnu dot org gcc-bugzi...@gcc.gnu.org wrote:
> I don't see how this patch makes simple_iv call from number_of_iterations_exit return true for j_20. Could you please kindly explain?

We used to analyze the second scop after the code generation of the first one. In that context, the scalar evolution analysis failed to analyze the code containing scalar computations stored and read from arrays with 1 element (introduced by the code generation and analysis part).

We now analyze all the scops before code generating them: thus, we don't have to invalidate the scalar evolution hash tables between the analysis of two scops.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42771
[Bug tree-optimization/42558] [4.5 Regression][graphite] miscompilation related to -floop-block
--- Comment #6 from sebpop at gmail dot com 2010-02-06 19:42 ---
Subject: Re: [4.5 Regression][graphite] miscompilation related to -floop-block

> Is
>
> IMPLICIT NONE
> INTEGER, PARAMETER :: dp=KIND(0.0D0)
> REAL(KIND=dp) :: res
> res=exp_radius_very_extended( 0 , 1 , 0 , 1, (/0.0D0,0.0D0,0.0D0/), (/1.0D0,0.0D0,0.0D0/), (/1.0D0,0.0D0,0.0D0/), 1.0D0,1.0D0,1.0D0,1.0D0)
> if (res.ne.1.0d0) call abort()
> CONTAINS
> ...
>
> what you want?

Yes, thanks, I will include this in the testsuite.

Sebastian

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42558
[Bug tree-optimization/42521] [4.5 Regression] ICE: in graphite_loop_normal_form, at graphite-sese-to-poly.c:2844
--- Comment #10 from sebpop at gmail dot com 2010-01-13 18:15 --- Subject: Re: [4.5 Regression] ICE: in graphite_loop_normal_form, at graphite-sese-to-poly.c:2844 pdv_d.f:89:0: error: definition in block 40 does not dominate use in block 212 for SSA_NAME: prephitmp.28_439 in statement: D.2771_606 = D.2770_605 = prephitmp.28_439; The error comes from the fact that we are not clearing the scev information anymore in between the code generation of two scops. In this particular case, we have two scops, the second scop contains a loop for which the number of iterations is a variable computed in the first scop, and because we do not update the niter/scev info we keep referring to the old SSA_NAME, prephitmp.28_439. A solution would be to rename all the scev info based on the rename_map that is computed by the translation of the first scop. I am working on a patch to do that. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42521
[Bug middle-end/42512] [4.5 Regression] integer wrong code bug with loop
--- Comment #11 from sebpop at gmail dot com 2010-01-08 17:55 ---
Subject: Re: [4.5 Regression] integer wrong code bug with loop

> Ok, I have that fixed locally at the place of the patch but I wonder if initial_condition () shouldn't return for example
>
>   1ul for (unsigned long) { 1, +, 1 }_1

This is correct.

> and
>
>   (int) i_2 for (int) { i_2, +, 1 }_1
>
> and further (for short i_2)
>
>   i_2 for (short) { (int) { i_2, +, 1 }_2, +, 1 }_1
>
> ? Can the latter two happen at all?

Yes, these could happen, and you are right, we should see the initial value of a chrec through the type conversion lenses.

> Is it even correct to talk about a general initial condition in this case? Consider
>
>   { { 1, +, 1 }_2, +, 1 }_1
>
> initial_condition will return 1 for the chrec even though that is not correct because the initial condition is not constant in loop 1.

If you want, there is an initial condition for the loop_1 and that would be {1, +, 1}_2, and there is an initial condition 1 for loop nest loop_2:

loop_2
  i = loop_2_phi (0, i+1) = {1, +, 1}_2
  loop_1
    j = loop_1_phi (i, j+1) = {{1, +, 1}_2, +, 1}_1

> I suppose I'd only see that if instantiating the chrec at the point where I placed the fix? So I really only see at most a single outer conversion around the chrec?

Yes, I think that at most you can have only one conversion around a chrec.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42512
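
One way to read the nested chrec above is as a function of the two iteration counts. The sketch below assumes plain long arithmetic with no type conversions or overflow, which is exactly the part this bug is about, so it only illustrates the "initial condition per loop" reading; the function names chrec_eval and nested_chrec_eval are made up.

```c
/* Value of the affine chrec {base, +, step} after x iterations of its loop.  */
long chrec_eval(long base, long step, long x)
{
  return base + step * x;
}

/* {{1, +, 1}_2, +, 1}_1 at iteration x2 of loop_2 and x1 of loop_1.  */
long nested_chrec_eval(long x2, long x1)
{
  /* Initial condition seen by loop_1: {1, +, 1}_2, which varies with x2.  */
  long initial_in_loop_1 = chrec_eval(1, 1, x2);
  return chrec_eval(initial_in_loop_1, 1, x1);
}
```

The value that loop_1 starts from is 1 only when x2 is 0; for later iterations of loop_2 it is 1 + x2, which is why reporting the constant 1 as the initial condition inside loop_1 is not correct.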
[Bug tree-optimization/42641] Random code-generation differences with GRAPHITE
--- Comment #6 from sebpop at gmail dot com 2010-01-07 17:58 ---
Subject: Re: Random code-generation differences with GRAPHITE

After your change, there remain three users of htab_hash_pointer in graphite:

In if_region_set_false_region, there is a use of htab_hash_pointer, but that matches the use of the loops->exits htab as also used in get_exit_descriptions.

The next two are:

hashval_t
ivtype_map_elt_info (const void *elt)
{
  return htab_hash_pointer (((const struct ivtype_map_elt_s *) elt)->cloog_iv);
}

static inline hashval_t
clast_name_index_elt_info (const void *elt)
{
  return htab_hash_pointer (((const struct clast_name_index *) elt)->name);
}

and they are a bit more difficult to change, as it is the interface with CLooG that uses a char * to identify loop induction variables. In both cases, we're hashing on that string identifier.

Should these two functions be changed as well?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42641
[Bug tree-optimization/42641] Random code-generation differences with GRAPHITE
--- Comment #9 from sebpop at gmail dot com 2010-01-07 21:30 ---
Subject: Re: Random code-generation differences with GRAPHITE

> htab_hash_pointer is fine if a hash table is never traversed, or such traversal can't affect code generation. E.g. graphite has some debug_* routines that traverse such hash tables, that's fine, they aren't called at all during compilation except for debugging sessions.

Ok, thanks for the detailed explanation. The two other htabs using htab_hash_pointer ivtype_map_elt_info and clast_name_index_elt_info are safe.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42641
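
The difference between hashing a pointer and hashing what it points to can be sketched as follows. Both functions are hypothetical illustrations, not libiberty's implementation: the pointer hash depends on where the allocator happened to place the string, so the order in which a table is traversed can change from run to run, while a content hash (FNV-1a here, chosen only as a well-known example) depends on the characters alone.

```c
#include <stdint.h>

/* Hypothetical pointer hash: depends on the address of the object.  */
uint32_t sketch_hash_pointer(const void *p)
{
  return (uint32_t) ((uintptr_t) p >> 3);
}

/* FNV-1a over the characters: depends only on the contents.  */
uint32_t sketch_hash_string(const char *s)
{
  uint32_t h = 2166136261u;
  for (; *s != '\0'; s++) {
    h ^= (unsigned char) *s;
    h *= 16777619u;
  }
  return h;
}
```

Two separate copies of the same induction variable name hash identically with sketch_hash_string wherever they are stored, which is the property that matters once a traversal of the table can influence generated code.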
[Bug testsuite/42135] FAIL: libgomp.graphite/force-parallel-2.c execution test
--- Comment #1 from sebpop at gmail dot com 2009-11-24 06:51 ---
Subject: Re: New: FAIL: libgomp.graphite/force-parallel-2.c execution test

On Sat, Nov 21, 2009 at 14:57, dominiq at lps dot ens dot fr gcc-bugzi...@gcc.gnu.org wrote:
> Since revision 150792, the test libgomp.graphite/force-parallel-2.c (introduced in revision 150755) fails on *-apple-darwin9. AFAICT the array x[1][1] is allocated in stack and is too big for the 64Mb hard limit on darwin. One solution could be to replace 1 with 4000. Also the following patch works.

Please update the size of arrays.

Thanks,
Sebastian

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42135
[Bug tree-optimization/41811] graphite miscompiles 454.calculix of the SPEC 2k6
--- Comment #4 from sebpop at gmail dot com 2009-10-23 19:54 ---
Subject: Re: graphite miscompiles 454.calculix of the SPEC 2k6

On Fri, Oct 23, 2009 at 14:46, spop at gcc dot gnu dot org gcc-bugzi...@gcc.gnu.org wrote:
> and the code generated by CLooG for the interchange looks like this:
>
> for (scat_1=0;scat_1<=2;scat_1++) {
>   for (scat_3=0;scat_3<=2;scat_3++) {
>     S4(scat_1,scat_3) ;
>     for (scat_5=0;scat_5<=2;scat_5++) {
>       S5(scat_1,scat_5,scat_3) ;
>     }
>     S7(scat_1,scat_3) ;
>     S18(scat_1,scat_3) ;
>   }

S7 and S18 should not be generated before S5 finishes executing over all the iterations of the original innermost loop (do k=1,20). S7 and S18 contain the end of the reduction and the write in the array xs(i,j) that is independent of the k loop.

>   for (scat_3=3;scat_3<=19;scat_3++) {
>     for (scat_5=0;scat_5<=2;scat_5++) {
>       S5(scat_1,scat_5,scat_3) ;
>     }
>   }
> }

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41811
[Bug tree-optimization/41406] -O3 conflict with -floop-strip-mine internal compiler error: in build_loop_iteration_domains, at graphite-sese-to-poly.c:1156
--- Comment #4 from sebpop at gmail dot com 2009-09-19 23:31 ---
Subject: Re: -O3 conflict with -floop-strip-mine internal compiler error: in build_loop_iteration_domains, at graphite-sese-to-poly.c:1156

Could you run gdb and report the backtrace?

# gdb build-gcc/gcc/cc1
(gdb) run -O3 -floop-strip-mine ... aes.i
(gdb) bt

Thanks,
Sebastian

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41406
[Bug middle-end/40981] aermod.f90 ICEs on -O2 -fgraphite-identity -floop-strip-mine
--- Comment #21 from sebpop at gmail dot com 2009-08-14 17:16 --- Subject: Re: aermod.f90 ICEs on -O2 -fgraphite-identity -floop-strip-mine Actually the error in gdb has changed with 1677_max.diff... As expected: see the gcc_assert in the patch

+/* Return in RES the maximum of the linear expression LE on polyhedron PS.  */
+
+void
+ppl_max_for_le (ppl_Pointset_Powerset_C_Polyhedron_t ps,
+                ppl_Linear_Expression_t le, Value res)
+{
+  ppl_Coefficient_t num, denom;
+  Value dv, nv;
+  int maximum;
+
+  value_init (nv);
+  value_init (dv);
+  ppl_new_Coefficient (&num);
+  ppl_new_Coefficient (&denom);
+  ppl_Pointset_Powerset_C_Polyhedron_maximize (ps, le, num, denom, &maximum);
+
+  if (maximum)
+    {
+      ppl_Coefficient_to_mpz_t (num, nv);
+      ppl_Coefficient_to_mpz_t (denom, dv);
+      gcc_assert (value_notzero_p (dv));
+      value_division (res, nv, dv);
+    }
+
+  value_clear (nv);
+  value_clear (dv);
+  ppl_delete_Coefficient (num);
+  ppl_delete_Coefficient (denom);
+}

What worries me is that PPL finds a maximum, but that is not a valid max, as the denominator is zero. Roberto, could you please look at the bug http://gcc.gnu.org/PR40981 ? Thanks, Sebastian

Program exited with code 04.
(gdb) bt
No stack.
so I am no longer able to get a back trace.

That's because you do not load the gdbinit.in from the gcc dir, where you have the following breakpoints:

# Put breakpoints at exit and fancy_abort in case abort is mapped
# to either fprintf/exit or fancy_abort.
b fancy_abort
# Put a breakpoint on internal_error to help with debugging ICEs.
b internal_error

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40981
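The concern in the comment above (a maximum is reported, yet its denominator is zero) can be illustrated with a small stand-alone sketch. It does not use PPL: `struct max_result` and `max_as_integer` are hypothetical names invented here, and plain `long` values stand in for PPL coefficients. Unlike the patch, which asserts, this sketch simply refuses to divide and reports failure; that is one possible policy, not necessarily what graphite does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for what a maximize call reports:
   a flag saying a maximum exists, and the value as num/denom.  */
struct max_result
{
  bool found;
  long num;
  long denom;
};

/* Store num/denom in *RES and return true only when the reported
   maximum is usable; a zero denominator is rejected, not divided.  */
static bool
max_as_integer (struct max_result r, long *res)
{
  assert (res != NULL);

  if (!r.found || r.denom == 0)
    return false;

  *res = r.num / r.denom;
  return true;
}
```

A caller would treat a `false` return as "no usable bound" and fall back to a conservative answer, leaving `*res` untouched.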
[Bug middle-end/40965] [4.5 Regression][graphite] slow compilation
--- Comment #6 from sebpop at gmail dot com 2009-08-05 14:04 --- Subject: Re: [4.5 Regression][graphite] slow compilation What changed from 4.4 to 4.5 is that we now get to compile larger SCoPs with Graphite. In 4.5, Graphite can deal with reductions and other unhandled constructs like the NE_EXPR that Fortran is frequently using for representing the exit condition of DO loops. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40965
[Bug middle-end/40965] [graphite] slow compilation
--- Comment #3 from sebpop at gmail dot com 2009-08-05 04:42 --- Subject: Re: [graphite] slow compilation On Tue, Aug 4, 2009 at 17:44, rguenth at gcc dot gnu dot org gcc-bugzi...@gcc.gnu.org wrote: Eh - where's that exponential time complexity? ... The code generation of Graphite can be exponential, didn't I mention it yet? We should find some cutoff factor, a function of the number of loops in a SCoP, to avoid these long compile times. Another solution is to use the PPL watchdog utility to cut off the exponential operations in a deterministic way. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40965
[Bug bootstrap/40103] CLooG header files are not -Wc++-compat ready
--- Comment #8 from sebpop at gmail dot com 2009-06-09 18:17 --- Subject: Re: CLooG header files are not -Wc++-compat ready On Tue, Jun 9, 2009 at 12:42, joseph at codesourcery dot com gcc-bugzi...@gcc.gnu.org wrote: I think you should allow more time for people to update after preparing a fixed tarball for the infrastructure directory; won't this have broken bootstrap for everyone using any existing cloog-ppl release tarball (as referenced in install.texi on trunk)? Yes, this would break the bootstrap unless you update the cloog sources from cloog-ppl git. I will revert this patch. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40103
[Bug tree-optimization/40062] [4.3/4.4/4.5 Regression] high memory usage and compile time in SCEV cprop with -O3
--- Comment #4 from sebpop at gmail dot com 2009-05-08 12:12 --- Subject: Re: [4.3/4.4/4.5 Regression] high memory usage and compile time in SCEV cprop with -O3

+      /* Increase the limit by the PHI argument number to avoid exponential
+         time and memory complexity.  */

This looks good. Thanks, Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40062
[Bug middle-end/39568] [graphite] Remove GBB_LOOPS
--- Comment #3 from sebpop at gmail dot com 2009-03-30 20:03 --- Subject: Re: [graphite] Remove GBB_LOOPS Awesome! Thanks Li, the patch looks good. Tobias will take care of including it to the graphite branch. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39568
[Bug tree-optimization/35011] ICE with -fcheck-data-deps
--- Comment #10 from sebpop at gmail dot com 2009-03-29 02:15 --- Subject: Re: ICE with -fcheck-data-deps The bug disappeared on the trunk between 2009-03-15 and 2009-03-20. This bug might have been fixed by PR39500. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35011
[Bug c/39500] autopar fails to parallel
--- Comment #5 from sebpop at gmail dot com 2009-03-19 02:52 --- Subject: Re: autopar fails to parallel On Wed, Mar 18, 2009 at 20:46, nemokingdom at gmail dot com gcc-bugzi...@gcc.gnu.org wrote: I add test case for this situation. Yes, indeed this is a good idea. Thanks for the testcase, Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39500
[Bug c/39500] autopar fails to parallel
--- Comment #6 from sebpop at gmail dot com 2009-03-19 02:53 --- Subject: Re: autopar fails to parallel

           What    |Removed                     |Added
          Component|middle-end                  |c

Please leave the Component as middle-end; this bug is not related to the C language. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39500
[Bug middle-end/39447] ICE in create_data_ref with -O1 -floop-interchange
--- Comment #5 from sebpop at gmail dot com 2009-03-16 22:34 --- Subject: Re: ICE in create_data_ref with -O1 -floop-interchange Thanks for the reduced testcase, it completely went out of my radar (by now my delta script should have finished reducing it as well on the gcc-farm, but I won't even look at it). Thanks again for the reduced case. I will look at the bug now. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39447
[Bug middle-end/39447] ICE in create_data_ref with -O1 -floop-interchange
--- Comment #6 from sebpop at gmail dot com 2009-03-16 23:18 --- Subject: Re: ICE in create_data_ref with -O1 -floop-interchange Hi, I don't know who coded the overly complicated exclude_component_ref. In the graphite branch we already cleaned up all this code, but in trunk we still have it. Attached is a patch that fixes the problem by looking at whether the operand contains COMPONENT_REFs before calling the data reference analysis. I'm testing the patch on the gcc farm, and will send it to gcc-patches once it finishes regstrap. Sebastian
--- Comment #7 from sebpop at gmail dot com 2009-03-16 23:18 --- Created an attachment (id=17470) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17470&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39447
[Bug middle-end/39335] ICE in GCC 4.4 with -O[123] -floop-interchange
--- Comment #5 from sebpop at gmail dot com 2009-03-02 18:57 --- Subject: Re: ICE in GCC 4.4 with -O[123] -floop-interchange Hi, The only thing that graphite modifies comes from canonicalize_loop_ivs: here is the diff between 1, the output of debug_loops (3) before graphite, and 2, the output after graphite.

--- 1 2009-03-02 12:20:03.0 -0600
+++ 2 2009-03-02 12:20:18.0 -0600
@@ -23,6 +23,8 @@
 bb_4 (preds = {bb_3 }, succs = {bb_10 })
 {
 <bb 4>:
+D.1655_27 = (unsigned int) width_12(D);
+D.1656_25 = D.1655_27 + 4294967295;
 goto <bb 10>;
 }
@@ -90,6 +92,8 @@
 bb_11 (preds = {bb_10 }, succs = {bb_6 })
 {
 <bb 11>:
+ D.1652_24 = (unsigned int) num_comp_14(D);
+ D.1653_6 = D.1652_24 + 4294967295;
 goto <bb 6>;
 }
@@ -98,16 +102,18 @@
 bb_5 (preds = {bb_6 }, succs = {bb_6 })
 {
 <bb 5>:
+ivtmp.25_2 = ivtmp.25_4 + 1;
 }
 bb_6 (preds = {bb_5 bb_11 }, succs = {bb_5 bb_7 })
 {
 <bb 6>:
-# in_47 = PHI <in_17(5), in_21(11)>
-# out_48 = PHI <out_16(5), out_32(11)>
-# i_49 = PHI <i_18(5), 0(11)>
 # SMT.10_50 = PHI <SMT.10_30(5), SMT.10_34(11)>
 # SMT.11_52 = PHI <SMT.11_31(5), SMT.11_35(11)>
+# ivtmp.25_4 = PHI <ivtmp.25_2(5), 0(11)>
+in_47 = in_21 + ivtmp.25_4;
+out_48 = out_32 + ivtmp.25_4;
+i_49 = (int) ivtmp.25_4;
 # VUSE <SMT.10_50, SMT.11_52> { SMT.10 SMT.11 }
 D.1617_15 = *in_47;
 # SMT.10_30 = VDEF <SMT.10_50>
@@ -116,7 +122,7 @@
 out_16 = out_48 + 1;
 in_17 = in_47 + 1;
 i_18 = i_49 + 1;
-if (num_comp_14(D) > i_18)
+if (ivtmp.25_4 < D.1653_6)
 goto <bb 5>;
 else
 goto <bb 7>;

The fail is in RTL expand in copy_to_mode_reg:

gcc_assert (GET_MODE (x) == mode || GET_MODE (x) == VOIDmode);

(gdb) p x->mode
$16 = SImode
(gdb) p mode
$17 = DImode

It looks like a type problem for the condition to be expanded:

(gdb) p exp
$18 = (tree) 0x7fa9967d2e00
(gdb) pgs
ivtmp.25 < D.1653;

It turns out that canonicalize_loop_ivs does compute the largest precision over all the phi nodes of the loop, such that the new induction variable can represent all the values of the old IVs, i.e.:

for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
  {
    phi = gsi_stmt (psi);
    res = PHI_RESULT (phi);
    if (is_gimple_reg (res) && TYPE_PRECISION (TREE_TYPE (res)) > precision)
      precision = TYPE_PRECISION (TREE_TYPE (res));
  }
type = lang_hooks.types.type_for_size (precision, 1);

but it does not fold_convert the number of iterations to this new type, and thus we end up building a condition whose operands have two different precisions: 32 for niter and 64 for the new IV. Attached is a fix for this problem, and the diff between 1 before and 5 after graphite looks like this:

--- 1 2009-03-02 12:20:03.0 -0600
+++ 5 2009-03-02 12:54:27.0 -0600
@@ -23,6 +23,8 @@
 bb_4 (preds = {bb_3 }, succs = {bb_10 })
 {
 <bb 4>:
+D.1656_25 = (unsigned int) width_12(D);
+D.1657_5 = D.1656_25 + 4294967295;
 goto <bb 10>;
 }
@@ -90,6 +92,9 @@
 bb_11 (preds = {bb_10 }, succs = {bb_6 })
 {
 <bb 11>:
+ D.1652_24 = (unsigned int) num_comp_14(D);
+ D.1653_6 = D.1652_24 + 4294967295;
+ D.1654_4 = (long unsigned int) D.1653_6;
 goto <bb 6>;
 }
@@ -98,16 +103,18 @@
 bb_5 (preds = {bb_6 }, succs = {bb_6 })
 {
 <bb 5>:
+ivtmp.25_27 = ivtmp.25_2 + 1;
 }
 bb_6 (preds = {bb_5 bb_11 }, succs = {bb_5 bb_7 })
 {
 <bb 6>:
-# in_47 = PHI <in_17(5), in_21(11)>
-# out_48 = PHI <out_16(5), out_32(11)>
-# i_49 = PHI <i_18(5), 0(11)>
 # SMT.10_50 = PHI <SMT.10_30(5), SMT.10_34(11)>
 # SMT.11_52 = PHI <SMT.11_31(5), SMT.11_35(11)>
+# ivtmp.25_2 = PHI <ivtmp.25_27(5), 0(11)>
+in_47 = in_21 + ivtmp.25_2;
+out_48 = out_32 + ivtmp.25_2;
+i_49 = (int) ivtmp.25_2;
 # VUSE <SMT.10_50, SMT.11_52> { SMT.10 SMT.11 }
 D.1617_15 = *in_47;
 # SMT.10_30 = VDEF <SMT.10_50>
@@ -116,7 +123,7 @@
 out_16 = out_48 + 1;
 in_17 = in_47 + 1;
 i_18 = i_49 + 1;
-if (num_comp_14(D) > i_18)
+if (ivtmp.25_2 < D.1654_4)
 goto <bb 5>;
 else
 goto <bb 7>;

Sebastian Pop -- AMD - GNU Tools
--- Comment #6 from sebpop at gmail dot com 2009-03-02 18:57 --- Created an attachment (id=17386) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17386&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39335
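For readers who want to see the fixed arithmetic outside of a GIMPLE dump, here is a minimal sketch in plain C. It assumes a target where `unsigned int` is 32 bits and `long unsigned int` is 64 bits (as on the amd64 machine in the report); `widened_niter` and `count_iterations` are names made up for this illustration and are not GCC functions. The first function mirrors the three statements that the fix emits in bb 11: compute the iteration count in 32 bits, then convert it once to the 64-bit type of the new induction variable, so that the exit test compares two values of the same precision.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the statements emitted in bb 11 after the fix.  */
static uint64_t
widened_niter (int num_comp)
{
  uint32_t n32 = (uint32_t) num_comp;             /* (unsigned int) num_comp */
  uint32_t niter32 = n32 + UINT32_C (4294967295); /* 32-bit wraparound: n - 1 */
  return (uint64_t) niter32;                      /* (long unsigned int) ...  */
}

/* Count executions of the loop body in bb 6 when the exit test is
   "iv < niter" with both operands 64 bits wide.  Only meant for small
   positive NUM_COMP, since num_comp == 0 wraps to 4294967295.  */
static unsigned
count_iterations (int num_comp)
{
  assert (num_comp >= 1);

  uint64_t niter = widened_niter (num_comp);
  uint64_t iv = 0;
  unsigned trips = 1;      /* the body runs once before the first test */

  while (iv < niter)       /* same-precision comparison */
    {
      iv++;                /* the latch block, bb 5 */
      trips++;
    }
  return trips;
}
```

The design point is that the subtraction still happens in the narrow type, so the wraparound behaviour of the original expression is preserved; only the comparison operand is widened.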
[Bug middle-end/39308] ICE when compiling with -O[s123] -floop-interchange
--- Comment #10 from sebpop at gmail dot com 2009-02-26 20:10 --- Subject: Re: ICE when compiling with -O[s123] -floop-interchange Hi, Can you try this patch? It should fix your problem. I will bootstrap and test the patch and send it for review. Thanks, Sebastian Pop -- AMD - GNU Tools On Thu, Feb 26, 2009 at 13:46, il dot basso dot buffo at gmail dot com gcc-bugzi...@gcc.gnu.org wrote: --- Comment #9 from il dot basso dot buffo at gmail dot com 2009-02-26 19:46 --- Thanks, Sebastian. I followed your directions, except I used -O1 instead of -O2. Here's the backtrace:

#0 is_gimple_val (t=0x0) at ../.././gcc/gimple.c:2853
#1 0x0055dec4 in force_gimple_operand (expr=0x0, stmts=0x7fffdc78a538, simple=1 '\001', var=0x0) at ../.././gcc/gimplify.c:7592
#2 0x0097fc22 in build_scop_loop_nests (scop=0x12ce530) at ../.././gcc/graphite.c:2387
#3 0x00981f1a in limit_scops () at ../.././gcc/graphite.c:6081
#4 0x00983de7 in graphite_transform_loops () at ../.././gcc/graphite.c:6124
#5 0x006cd137 in graphite_transforms () at ../.././gcc/tree-ssa-loop.c:298
#6 0x005b87ca in execute_one_pass (pass=0xe8ab60) at ../.././gcc/passes.c:1277
#7 0x005b89b0 in execute_pass_list (pass=0xe8ab60) at ../.././gcc/passes.c:1326
#8 0x005b89c5 in execute_pass_list (pass=0xe8a8c0) at ../.././gcc/passes.c:1327
#9 0x005b89c5 in execute_pass_list (pass=0xe89d80) at ../.././gcc/passes.c:1327
#10 0x00678bca in tree_rest_of_compilation (fndecl=0x7fbcd2977100) at ../.././gcc/tree-optimize.c:420
#11 0x00789665 in cgraph_expand_function (node=0x7fbcd2977700) at ../.././gcc/cgraphunit.c:1047
#12 0x0078ab00 in cgraph_optimize () at ../.././gcc/cgraphunit.c:1106
#13 0x00413dcb in c_write_global_declarations () at ../.././gcc/c-decl.c:8102
#14 0x0064301e in toplev_main (argc=<value optimized out>, argv=<value optimized out>) at ../.././gcc/toplev.c:981
#15 0x7fbcd315c60d in __libc_start_main () from /lib/libc.so.6
#16 0x00405229 in _start ()

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39308
--- Comment #11 from sebpop at gmail dot com 2009-02-26 20:10 --- Created an attachment (id=17369) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17369&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39308
[Bug bootstrap/39262] [miro] - Revision 144368 - Werror in ../miro/gcc/genconstants.c
--- Comment #5 from sebpop at gmail dot com 2009-02-22 14:55 --- Subject: Re: [miro] - Revision 144368 - Werror in ../miro/gcc/genconstants.c I will fix this with the attached patch when approved.
--- Comment #6 from sebpop at gmail dot com 2009-02-22 14:55 --- Created an attachment (id=17342) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17342&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39262
[Bug middle-end/38868] r143152 breaks output routines in xplor-nih
--- Comment #21 from sebpop at gmail dot com 2009-01-17 15:11 --- Subject: Re: r143152 breaks output routines in xplor-nih On Sat, Jan 17, 2009 at 6:29 AM, dominiq at lps dot ens dot fr gcc-bugzi...@gcc.gnu.org wrote: Somehow I got the impression that graphite is now enabled at -O2 We did enable -floop-block and -fgraphite-identity at -O2 and higher, but again, only in the graphite branch, not in trunk. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38868
[Bug middle-end/38431] [graphite] several ICEs with CP2K (summary)
--- Comment #33 from sebpop at gmail dot com 2009-01-14 10:20 --- Subject: Re: [graphite] several ICEs with CP2K (summary) Attached a fix for this PR. I will regstrap and submit for review. Sebastian
--- Comment #34 from sebpop at gmail dot com 2009-01-14 10:20 --- Created an attachment (id=17097) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17097&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38431
[Bug testsuite/38791] FAIL: gcc.dg/graphite/block-3.c (test for excess errors)
--- Comment #8 from sebpop at gmail dot com 2009-01-14 14:45 --- Subject: Re: FAIL: gcc.dg/graphite/block-3.c (test for excess errors) Before closing this pr as fixed, I have a question: usually tests having -fdump-* in dg-options are doing some search of patterns in the dumped file, e.g. in gcc/testsuite/gcc.dg/pr35729.c

/* { dg-options "-Os -fdump-rtl-loop2_invariant" } */
...
/* { dg-final { scan-rtl-dump-times "Decided to move invariant" 0 "loop2_invariant" } } */

I noticed that gcc/testsuite/gcc.dg/graphite/block-3.c has only the cleaning dg-final, but no scan-* one(s). I don't see anything in gcc/testsuite/gcc.dg/graphite/graphite.exp that could supply it either. Is this the intended behavior or is there something missing in this test (and possibly other graphite ones)? The test for loop blocking is missing in block-3.c. We will have to clean up the graphite testsuite and make the tests more reliable, but this will probably be done in GCC 4.5. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38791
[Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
--- Comment #1 from sebpop at gmail dot com 2009-01-14 18:42 --- Subject: Re: New: [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron) Hi, Thanks for this report. Please also test with the code of graphite branch that contains a patch that schedules several scalar optimizations that can improve the quality of the code generated. Thanks, Sebastian Pop -- AMD - GNU Tools -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38846
[Bug middle-end/38431] [graphite] several ICEs with CP2K (summary)
--- Comment #28 from sebpop at gmail dot com 2009-01-13 19:52 --- Subject: Re: [graphite] several ICEs with CP2K (summary) Hi, I compiled BLAS and LAPACK with the gfortran compiler of the graphite branch such that I could test the CP2K benchmark. On my laptop, that is an amd64-linux, make test passes with the gfortran compiler from the graphite branch. However I'm not able to run the test that you reported failing: ./cp2k.sopt canonical.inp CP2K: The specified file canonical.inp can not be opened, it does not exist. STOP 1 Could you tell me where I can find the canonical.inp file, or how to reproduce the fail? Thanks, Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38431
[Bug middle-end/38431] [graphite] several ICEs with CP2K (summary)
--- Comment #30 from sebpop at gmail dot com 2009-01-13 21:57 --- Subject: Re: [graphite] several ICEs with CP2K (summary) Thanks for the clarification, I managed to reproduce the fail. The problem comes from the fact that we do not generate code for a scalar reduction that is not detected as a scalar reduction with the variable connection$dim$1$lbound. In the attached output from debug_loops (3) I selected the region of code containing both the original loops: loop_3 and loop_4 and the code generated by graphite with -fgraphite-identity: loop_22, loop_23. In loop_22 the computation on connection$dim$1$lbound disappears. I wonder what this variable stands for: it is not used elsewhere in the debug_loops (3) output of change_bond_length function, and I suspect that this is a global variable whose value is needed elsewhere outside the change_bond_length function. The bug is in the detection of scalar reductions. Sebastian
--- Comment #31 from sebpop at gmail dot com 2009-01-13 21:57 --- Created an attachment (id=17095) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17095&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38431
[Bug middle-end/38431] [graphite] several ICEs with CP2K (summary)
--- Comment #27 from sebpop at gmail dot com 2009-01-11 13:42 --- Subject: Re: [graphite] several ICEs with CP2K (summary) On Sun, Jan 11, 2009 at 6:58 AM, jv244 at cam dot ac dot uk gcc-bugzi...@gcc.gnu.org wrote: I'll see if I can narrow down the problem to the single subroutine (change_bond_length) which I suspect is the issue. [all of this with trunk 143207] yes, just looking at change_bond_length should be enough. I'm looking at the code generated for this function. Thanks for the detailed analysis. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38431
[Bug testsuite/38791] FAIL: gcc.dg/graphite/block-3.c (test for excess errors)
--- Comment #2 from sebpop at gmail dot com 2009-01-10 21:32 --- Subject: Re: FAIL: gcc.dg/graphite/block-3.c (test for excess errors) Does the attached patch fix the fail? Thanks, Sebastian
--- Comment #3 from sebpop at gmail dot com 2009-01-10 21:32 --- Created an attachment (id=17072) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17072&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38791
[Bug c/38755] [graphite] wrong code with -O3 -fgraphite-identity -floop-block on polyhedron benchmarks
--- Comment #3 from sebpop at gmail dot com 2009-01-08 16:33 --- Subject: Re: [graphite] wrong code with -O3 -fgraphite-identity -floop-block on polyhedron benchmarks --- Comment #2 from howarth at nitro dot med dot uc dot edu 2009-01-08 16:20 --- What checkin fixed this? Or can't you reproduce the failure? If the latter, what happens if you use -m32 on amd64-linux? I'm running the polyhedron bmk with -m32 and will report the result. But this should be fixed now by the patch for PR38559. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38755
[Bug c/38755] [graphite] wrong code with -O3 -fgraphite-identity -floop-block on polyhedron benchmarks
--- Comment #4 from sebpop at gmail dot com 2009-01-08 17:05 --- Subject: Re: [graphite] wrong code with -O3 -fgraphite-identity -floop-block on polyhedron benchmarks I'm running the polyhedron bmk with -m32 and will report the result. Actually I cannot run the test with -m32 on my machine. Could you please test on i686-apple-darwin9 and report the result. Thanks, Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38755
[Bug middle-end/38431] [graphite] several ICEs with CP2K (summary)
--- Comment #21 from sebpop at gmail dot com 2009-01-08 19:53 --- Subject: Re: [graphite] several ICEs with CP2K (summary) the testcase provide runs fine (AFAICT) with current trunk. I'll run the full CP2K testsuite to test somewhat better. Thanks for testing. Can you close the bug after the CP2K testsuite passes? Thanks, Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38431
[Bug middle-end/38431] [graphite] several ICEs with CP2K (summary)
--- Comment #17 from sebpop at gmail dot com 2009-01-07 19:23 --- Subject: Re: [graphite] several ICEs with CP2K (summary) I checked that current trunk (i.e. not graphite branch) still generates a segfaulting executable with FCFLAGS = -g -O2 -ffast-math -funroll-loops -ftree-vectorize -march=native -ffree-form -fgraphite -fgraphite-identity -floop-block -floop-strip-mine -floop-interchange Thanks for the update. I suspect that this is due to -floop-block. There are two more bugs 38559 and 38499 that we're looking at for fixing -floop-block. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38431
[Bug middle-end/38500] [graphite] ICE : in verify_loop_structure, at cfgloop.c:1569
--- Comment #3 from sebpop at gmail dot com 2008-12-12 22:31 --- Subject: Re: [graphite] ICE : in verify_loop_structure, at cfgloop.c:1569 2008-12-12 Jan Sjodin jan.sjo...@amd.com Harsha Jagasia harsha.jaga...@amd.com PR tree-optimization/38500 * gcc.dg/graphite/pr38500.c: New. * graphite.c (create_sese_edges): Call fix_loop_structure after splitting blocks. Okay for both trunk and branch. Thanks, Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38500
[Bug middle-end/38446] [graphite] The def for a var exists inside one of the scops bb's but an appropriate phi is not created to allow the phi to reach the use of that def ouside the scop.
--- Comment #4 from sebpop at gmail dot com 2008-12-10 22:37 --- Subject: Re: [graphite] The def for a var exists inside one of the scops bb's but an appropriate phi is not created to allow the phi to reach the use of that def ouside the scop. On Wed, Dec 10, 2008 at 4:34 PM, hjagasia at gcc dot gnu dot org [EMAIL PROTECTED] wrote: --- Comment #3 from hjagasia at gcc dot gnu dot org 2008-12-10 22:34 --- Created an attachment (id=16880) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16880&action=view) Updated patch reviewed by Sebastian This looks better thanks. Ok for graphite branch. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38446
[Bug middle-end/38084] [graphite] ICE : in build_graphite_scops, at graphite.c:1829
--- Comment #4 from sebpop at gmail dot com 2008-12-08 22:07 --- Subject: Re: [graphite] ICE : in build_graphite_scops, at graphite.c:1829 On Mon, Dec 8, 2008 at 3:49 PM, grosser at gcc dot gnu dot org [EMAIL PROTECTED] wrote: Fix The patch looks good. Please apply to graphite branch and trunk. For trunk, please also include the new testcases pr38084.c and id-3.f90, and make sure that the patch bootstraps and passes testsuite. Thanks, Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38084
[Bug bootstrap/38262] [4.4 regression] GCC components unnecessarily link with shared gmp/mpfr
--- Comment #7 from sebpop at gmail dot com 2008-12-05 06:34 --- Subject: Re: [4.4 regression] GCC components unnecessarily link with shared gmp/mpfr On Fri, Nov 28, 2008 at 3:38 AM, jakub at gcc dot gnu dot org [EMAIL PROTECTED] wrote: The patch looks good to me (if not obvious). Sebastian, are you going to post it to gcc-patches? I just sent it to gcc-patches. Sorry this took me so long to send: it went out of my radar. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38262
[Bug middle-end/37951] -ftree-parallelize-loops=2 fails for MA57
--- Comment #6 from sebpop at gmail dot com 2008-11-26 17:42 --- Subject: Re: -ftree-parallelize-loops=2 fails for MA57

gfortran -O3 -fdump-tree-vect-details -ftree-vectorize -ftree-parallelize-loops=2 -c ma57.f
ma57.f: In function 'ma57sd':
ma57.f:1538: internal compiler error: Segmentation fault
[...]
Program received signal SIGSEGV, Segmentation fault.
0xb7ce06f9 in free () from /lib/tls/i686/cmov/libc.so.6
(gdb) bt
#0 0xb7ce06f9 in free () from /lib/tls/i686/cmov/libc.so.6
#1 0xb7cdcf43 in _IO_free_backup_area () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7cdafb2 in _IO_file_overflow () from /lib/tls/i686/cmov/libc.so.6
#3 0xb7cda51b in _IO_file_xsputn () from /lib/tls/i686/cmov/libc.so.6
#4 0xb7cb675f in vfprintf () from /lib/tls/i686/cmov/libc.so.6
#5 0xb7cbf2e2 in fprintf () from /lib/tls/i686/cmov/libc.so.6
#6 0x083a634e in vect_print_dump_info (vl=REPORT_DETAILS) at

This ICE is different from the one you reported first: this one fails in the debug dump functions. Could you report the backtrace using the following flags:

gfortran -O3 -ftree-vectorize -ftree-parallelize-loops=2 -c ma57.f

Thank you, Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37951
[Bug bootstrap/38262] [4.4 regression] GCC components unnecessarily link with shared gmp/mpfr
--- Comment #3 from sebpop at gmail dot com 2008-11-26 18:20 --- Subject: Re: [4.4 regression] GCC components unnecessarily link with shared gmp/mpfr Thanks for catching the missing parts. Here is the updated patch. Does this patch look correct? I sent this patch to test on the gccfarm and will send an email to gcc-patches after it completes regstrap. Thanks, Sebastian On Tue, Nov 25, 2008 at 5:08 PM, ghazi at gcc dot gnu dot org [EMAIL PROTECTED] wrote: --- Comment #2 from ghazi at gcc dot gnu dot org 2008-11-25 23:08 --- (In reply to comment #1) Subject: Re: New: [4.4 regression] GCC components unnecessarily link with shared gmp/mpfr Here is a patch from Dwarak for fixing this. He will send this to review on gcc-patches@ list. Sebastian Pop -- AMD - GNU Tools Thanks, however I don't understand why in this patch xgcc and cpp are being linked with BACKENDLIBS. They aren't linked with libbackend.a. Also, there are many more places where you do need to add BACKENDLIBS like cc1plus, cc1obj, f951, jc1, etc. See here for all the places you'll need to catch: http://gcc.gnu.org/ml/gcc-patches/2008-02/msg00187.html -- ghazi at gcc dot gnu dot org changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2008-11-25 23:08:54
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38262
--- Comment #4 from sebpop at gmail dot com 2008-11-26 18:20 --- Created an attachment (id=16780) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16780&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38262
[Bug bootstrap/38262] [4.4 regression] GCC components unnecessarily link with shared gmp/mpfr
--- Comment #1 from sebpop at gmail dot com 2008-11-25 20:25 --- Subject: Re: New: [4.4 regression] GCC components unnecessarily link with shared gmp/mpfr Here is a patch from Dwarak for fixing this. He will send this to review on gcc-patches@ list. Sebastian Pop -- AMD - GNU Tools

--- Makefile.in 2008-10-23 10:33:51.274495000 -0500
+++ Makefile.in.fix 2008-11-19 16:11:55.80298 -0600
@@ -903,8 +903,9 @@ BUILD_LIBDEPS= $(BUILD_LIBIBERTY)
 # How to link with both our special library facilities
 # and the system's installed libraries.
-LIBS = @LIBS@ $(CPPLIB) $(LIBINTL) $(LIBICONV) $(LIBIBERTY) $(LIBDECNUMBER) \
-  $(GMPLIBS) $(CLOOGLIBS) $(PPLLIBS)
+LIBS = @LIBS@ $(CPPLIB) $(LIBINTL) $(LIBICONV) $(LIBIBERTY) $(LIBDECNUMBER)
+
+BACKENDLIBS = $(GMPLIBS) $(CLOOGLIBS) $(PPLLIBS)
 # Any system libraries needed just for GNAT.
 SYSLIBS = @GNAT_LIBEXC@
@@ -1613,7 +1614,7 @@ libbackend.a: $([EMAIL PROTECTED]@)
 xgcc$(exeext): $(GCC_OBJS) gccspec.o version.o intl.o prefix.o \
    version.o $(LIBDEPS) $(EXTRA_GCC_OBJS)
   $(CC) $(ALL_CFLAGS) $(LDFLAGS) -o $@ $(GCC_OBJS) gccspec.o \
-    intl.o prefix.o version.o $(EXTRA_GCC_OBJS) $(LIBS)
+    intl.o prefix.o version.o $(EXTRA_GCC_OBJS) $(LIBS) $(BACKENDLIBS)
 # cpp is to cpp0 as gcc is to cc1.
 # The only difference from xgcc is that it's linked with cppspec.o
@@ -1621,7 +1622,7 @@ xgcc$(exeext): $(GCC_OBJS) gccspec.o ver
 cpp$(exeext): $(GCC_OBJS) cppspec.o version.o intl.o prefix.o \
    version.o $(LIBDEPS) $(EXTRA_GCC_OBJS)
   $(CC) $(ALL_CFLAGS) $(LDFLAGS) -o $@ $(GCC_OBJS) cppspec.o \
-    intl.o prefix.o version.o $(EXTRA_GCC_OBJS) $(LIBS)
+    intl.o prefix.o version.o $(EXTRA_GCC_OBJS) $(LIBS) $(BACKENDLIBS)
 # Dump a specs file to make -B./ read these specs over installed ones.
 $(SPECS): xgcc$(exeext)
@@ -1638,7 +1639,7 @@ dummy-checksum.o : dummy-checksum.c
 cc1-dummy$(exeext): $(C_OBJS) dummy-checksum.o $(BACKEND) $(LIBDEPS)
   $(CC) $(ALL_CFLAGS) $(LDFLAGS) -o $@ $(C_OBJS) dummy-checksum.o \
-    $(BACKEND) $(LIBS) $(GMPLIBS)
+    $(BACKEND) $(LIBS) $(BACKENDLIBS)
 cc1-checksum.c : cc1-dummy$(exeext) build/genchecksum$(build_exeext)
   build/genchecksum$(build_exeext) cc1-dummy$(exeext) > $@
@@ -1647,7 +1648,7 @@ cc1-checksum.o : cc1-checksum.c
 cc1$(exeext): $(C_OBJS) cc1-checksum.o $(BACKEND) $(LIBDEPS)
   $(CC) $(ALL_CFLAGS) $(LDFLAGS) -o $@ $(C_OBJS) cc1-checksum.o \
-    $(BACKEND) $(LIBS) $(GMPLIBS)
+    $(BACKEND) $(LIBS) $(BACKENDLIBS)
 #
 # Build libgcc.a.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38262
[Bug middle-end/38250] ICE with -O2 -ftree-loop-distribution
--- Comment #4 from sebpop at gmail dot com 2008-11-24 17:27 --- Subject: Re: ICE with -O2 -ftree-loop-distribution The patch looks good. Please test and ask for approval to commit to trunk on [EMAIL PROTECTED] Thanks, Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38250
[Bug fortran/38044] -O1/-O2/-O3 -fgraphite-identity causes ICE when compiling induct.f90 Polyhedron 2005 benchmark
--- Comment #3 from sebpop at gmail dot com 2008-11-07 02:49 --- Subject: Re: -O1/-O2/-O3 -fgraphite-identity causes ICE when compiling induct.f90 Polyhedron 2005 benchmark On Thu, Nov 6, 2008 at 8:41 PM, howarth at nitro dot med dot uc dot edu [EMAIL PROTECTED] wrote: but no ICE at -O0 -fgraphite-identity. That's because at -O0 we don't even go in SSA form, and there are no loop transforms performed. Also have you checked that all the bugs that you have opened are not duplicates of other bugs already reported? See http://gcc.gnu.org/wiki/Graphite for a list of open bugs. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38044
[Bug middle-end/37379] [graphite] ICE compiling aermod.f90 with -ffast-math -floop-block -O2 -fgraphite
--- Comment #6 from sebpop at gmail dot com 2008-11-07 05:21 --- Subject: Re: [graphite] ICE compiling aermod.f90 with -ffast-math -floop-block -O2 -fgraphite Hi, For the first part of the bug:

aermod.f90:14521: internal compiler error: in instantiate_scev_1, at tree-scalar-evolution.c:2220

the bug was introduced by an automatic rewrite around TREE_CODE_LENGTH http://gcc.gnu.org/viewcvs?view=rev&revision=122018 The fix avoids the gcc_assert by returning an unknown scalar evolution. The second part of the bug was already fixed:

aermod.f90:8312: internal compiler error: in expand_scalar_variables_expr, at graphite.c:3168

I will apply the patch below once it finishes regstrap. Sebastian

Index: tree-scalar-evolution.c
===
--- tree-scalar-evolution.c (revision 141661)
+++ tree-scalar-evolution.c (working copy)
@@ -2213,7 +2213,9 @@ instantiate_scev_1 (basic_block instanti
       break;
     }
-  gcc_assert (!VL_EXP_CLASS_P (chrec));
+  if (VL_EXP_CLASS_P (chrec))
+    return chrec_dont_know;
+
   switch (TREE_CODE_LENGTH (TREE_CODE (chrec)))
     {
     case 3:

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37379
[Bug middle-end/37883] [graphite] ICE : in scan_tree_for_params, at graphite.c:2274
--- Comment #4 from sebpop at gmail dot com 2008-11-04 23:34 --- Subject: Re: [graphite] ICE : in scan_tree_for_params, at graphite.c:2274 It seems PLUS_EXPR and POINTER_PLUS_EXPR can really be handled identically. So I would like to commit this patch. Yes, they should be handled in the same way in this context. Please install the patch. Thanks, Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37883
[Bug tree-optimization/37573] [4.4 Regression] gcc-4.4 regression: incorrect code generation with -O1 -ftree-vectorize
--- Comment #12 from sebpop at gmail dot com 2008-10-22 16:10 --- Subject: Re: [4.4 Regression] gcc-4.4 regression: incorrect code generation with -O1 -ftree-vectorize common base. Consider s.c[1] and s + i, obviously the accesses can overlap - would you still say so if the base address of the first one would be s.c[0]? Yes, in the case s.c[1] versus s.c[0], we still have to consider the arrays to potentially overlap. (really the base address of a non-variable access is the access itself, right? s.c[1] in this case) No, it cannot be s.c[1] here. The base object for arrays in structs should be the struct itself. The base address tells you what memory object is accessed with an offset. For structs, you are allowed to access any of their contents using arithmetic. For instance in: struct s { int a[2]; int c[20]; } you could access s.c[10] from the address of struct s with: s.a + 12. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37573
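The offset arithmetic in the struct example above can be checked with a small C sketch. The helper names `int_index_of_c` and `read_c_from_base` are hypothetical and are not part of GCC; the sketch only illustrates why the struct, not `s.c[1]`, is the natural base object.

```c
#include <assert.h>
#include <stddef.h>

/* Layout from the comment above: two ints followed by twenty ints.  */
struct s {
  int a[2];
  int c[20];
};

/* Hypothetical helper: position of s.c[k], counted in ints from the
   start of the struct.  */
static size_t
int_index_of_c (size_t k)
{
  assert (k < 20);
  return offsetof (struct s, c) / sizeof (int) + k;
}

/* Read s.c[k] starting from the address of the enclosing struct,
   using byte arithmetic on the whole object.  */
static int
read_c_from_base (const struct s *p, size_t k)
{
  const unsigned char *base = (const unsigned char *) p;
  return *(const int *) (base + int_index_of_c (k) * sizeof (int));
}
```

With this layout `int_index_of_c (10)` evaluates to 12, matching the `s.a + 12` figure in the comment: an access expressed relative to `a` can land inside `c`, so the two arrays have to be treated as parts of one memory object.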
[Bug tree-optimization/37891] [graphite-branch] Invalid use of single_succ_edge in create_single_entry_edge
--- Comment #1 from sebpop at gmail dot com 2008-10-22 18:04 --- Subject: Re: New: [graphite-branch] Invalid use of single_succ_edge in create_single_entry_edge Commit 141283 introduced new code in create_single_entry_edge, that breaks polyhedron in linpk.f90, mdbx.f90, protein.f90, rnflow.f90, test_fpu.f90: Let's fix this bug by reverting that patch: the fix for the other bug should be in split_block instead of in graphite.c. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37891
[Bug bootstrap/36908] bootstrap forever with BOOT_CFLAGS=-O2 -ftree-loop-distribution
--- Comment #5 from sebpop at gmail dot com 2008-10-22 21:08 --- Subject: Re: bootstrap forever with BOOT_CFLAGS=-O2 -ftree-loop-distribution Sebastian, can you please look at this? Sorry for having missed this bug. The problem here is that we end up with two identical loops, as we copy almost all the statements into both loops. The attached patch solves the problem by counting the number of memory read and write operations per partition and comparing it to the total number of memory operations in the Reduced Dependence Graph. Loop distribution is stopped when one of the partitions contains all the memory ops. Okay for trunk once it finishes regstrap? Thanks, Sebastian --- Comment #6 from sebpop at gmail dot com 2008-10-22 21:08 --- Created an attachment (id=16529) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16529&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36908
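The stopping criterion described above can be sketched in C as follows. This is not the attached patch: `struct stmt_info`, `count_mem_ops` and `partition_has_all_mem_ops` are hypothetical stand-ins for GCC's dependence-graph data structures, and only the counting idea is taken from the comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one statement of the dependence graph:
   how many memory reads and writes it performs.  */
struct stmt_info {
  unsigned reads;
  unsigned writes;
};

/* Sum the memory operations of the statements selected by IDX.  */
static unsigned
count_mem_ops (const struct stmt_info *stmts, size_t n_stmts,
               const size_t *idx, size_t n_idx)
{
  unsigned total = 0;
  for (size_t i = 0; i < n_idx; i++)
    {
      assert (idx[i] < n_stmts);
      total += stmts[idx[i]].reads + stmts[idx[i]].writes;
    }
  return total;
}

/* Return true when PARTITION already holds every memory operation of
   the loop.  Splitting such a loop would only duplicate it, so
   distribution should stop.  */
static bool
partition_has_all_mem_ops (const struct stmt_info *stmts, size_t n_stmts,
                           const size_t *partition, size_t n_part)
{
  unsigned all = 0;
  for (size_t i = 0; i < n_stmts; i++)
    all += stmts[i].reads + stmts[i].writes;
  return count_mem_ops (stmts, n_stmts, partition, n_part) == all;
}
```

The comparison uses counts rather than set equality, which is enough here as long as each partition is a subset of the loop's statements and lists each statement once.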
[Bug middle-end/37886] [graphite] ICE: Segmentation fault
--- Comment #4 from sebpop at gmail dot com 2008-10-23 01:31 --- Subject: Re: [graphite] ICE: Segmentation fault On Wed, Oct 22, 2008 at 7:28 PM, grosser at gcc dot gnu dot org [EMAIL PROTECTED] wrote: Proposed fix in gloog() I added a fix for this SEGFAULT. But now we fail with: copy_data.c: In function 'copy_data': copy_data.c:1: internal compiler error: in expand_scalar_variables_expr, at graphite.c:3617 This is handled in Bug 37851. I would like to commit this fix and close this bug. The fix looks good. Please apply. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37886
[Bug tree-optimization/37686] [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE.
--- Comment #19 from sebpop at gmail dot com 2008-10-03 20:15 --- Subject: Re: [4.4 Regression] Building of CPU2000's bzip2 with peak flags with -mcpu=power4 fails with an ICE. Here is a patch that should fix this bug. Can somebody test that it fixes it? Thanks, Sebastian --- Comment #20 from sebpop at gmail dot com 2008-10-03 20:15 --- Created an attachment (id=16459) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16459&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
[Bug tree-optimization/37690] typo in the example for -floop-strip-mine
--- Comment #2 from sebpop at gmail dot com 2008-10-02 06:18 --- Subject: Re: typo in the example for -floop-strip-mine The patch looks good. Please install. I have also installed a similar patch in htdocs/gcc-4.4/changes.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37690
[Bug middle-end/37372] [graphite] SCoP detection splits bbs / Define SCoPs with single entry and exit edge
--- Comment #8 from sebpop at gmail dot com 2008-09-29 14:46 --- Subject: Re: [graphite] SCoP detection splits bbs / Define SCoPs with single entry and exit edge --- Comment #7 from grosser at gcc dot gnu dot org 2008-09-29 13:14 --- Committed SVN 140746. I see that in http://gcc.gnu.org/viewcvs?view=rev&revision=140746 you forgot to include in the changelog a line like this: PR tree-optimization/37372 that would have automatically included the commit message in the bugzilla bug. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37372
[Bug objc++/37335] Boostrap failed on obj-c++ too many arguments to function 'build_array_ref'
--- Comment #2 from sebpop at gmail dot com 2008-09-02 19:48 --- Subject: Re: Boostrap failed on obj-c++ too many arguments to function 'build_array_ref' On Tue, Sep 2, 2008 at 12:22 PM, 3dw4rd at verizon dot net [EMAIL PROTECTED] wrote: Graphite just went in. I might just wait till the turbulence dies down and try again. This should be another problem: graphite has not touched the code of build_array_ref, nor the code of gcc/objc/objc-act.c Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37335
[Bug tree-optimization/36287] [4.4 Regression] ICE with -O -ftree-loop-linear
--- Comment #3 from sebpop at gmail dot com 2008-05-21 16:31 --- Subject: Re: [4.4 Regression] ICE with -O -ftree-loop-linear Sebastian, that was your change. http://gcc.gnu.org/viewcvs?view=rev&revision=135672 was a clean-up of the lambda framework. I'm working on a fix. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36287
[Bug tree-optimization/36287] [4.4 Regression] ICE with -O -ftree-loop-linear
--- Comment #4 from sebpop at gmail dot com 2008-05-21 18:49 --- Subject: Re: [4.4 Regression] ICE with -O -ftree-loop-linear Fix attached: that's a bad typo. This also fixes PR36286. Sent to regstrap on gccfarm. I will commit it just after it passes. Sebastian --- Comment #5 from sebpop at gmail dot com 2008-05-21 18:49 --- Created an attachment (id=15667) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15667&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36287
[Bug tree-optimization/36228] redundant runtime check while vectorizing
--- Comment #3 from sebpop at gmail dot com 2008-05-15 21:55 --- Subject: Re: redundant runtime check while vectorizing Here is a patch for this: as a first data dependence test, it calls operand_equal_p on the array references. Tested with vect.exp and tree-ssa.exp. I will send another email to gcc-patches with the patch once it passes the tests on gccfarm. Sebastian --- Comment #4 from sebpop at gmail dot com 2008-05-15 21:55 --- Created an attachment (id=15643) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15643&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36228
[Bug tree-optimization/35011] [4.3 regression] ICE with -fcheck-data-deps
--- Comment #2 from sebpop at gmail dot com 2008-01-29 17:40 --- Subject: Re: [4.3 regression] ICE with -fcheck-data-deps On 29 Jan 2008 13:34:07 -, jakub at gcc dot gnu dot org [EMAIL PROTECTED] wrote: P4, unless you can reproduce without -fcheck-data-deps. -fcheck-data-deps is a compiler debugging option. Right, this bug is not very important. I will propose a fix for 4.4, and I will also run the specs with this option to stabilize the omega dependence analysis, that should replace a part of the old data dep tester in 4.4. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35011
[Bug tree-optimization/34976] verify_ssa ICE with -ftree-loop-linear
--- Comment #3 from sebpop at gmail dot com 2008-01-27 04:17 --- Subject: Re: verify_ssa ICE with -ftree-loop-linear Patch: http://gcc.gnu.org/ml/gcc-patches/2008-01/msg01294.html It does not really fix the problem; it just works around it. See also the comments here: http://gcc.gnu.org/ml/gcc-patches/2008-01/msg01293.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34976
[Bug tree-optimization/23821] [4.0/4.1/4.2/4.3 Regression] DOM and VRP creating harder to optimize code
--- Comment #14 from sebpop at gmail dot com 2008-01-12 00:11 --- Subject: Re: [4.0/4.1/4.2/4.3 Regression] DOM and VRP creating harder to optimize code Patch: http://gcc.gnu.org/ml/gcc-patches/2008-01/msg00518.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23821
[Bug tree-optimization/34679] [4.3 Regression] ICE: tree check: expected integer_type, have enumeral_type in host_integerp, at tree.c:4949 (predictive commoning)
--- Comment #3 from sebpop at gmail dot com 2008-01-09 08:08 --- Subject: Re: [4.3 Regression] ICE: tree check: expected integer_type, have enumeral_type in host_integerp, at tree.c:4949 (predictive commoning) Patch http://gcc.gnu.org/ml/gcc-patches/2008-01/msg00344.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34679
[Bug tree-optimization/34458] [4.3 Regression] ICE in int_cst_value, at tree.c:8047 at -O3
--- Comment #9 from sebpop at gmail dot com 2007-12-28 17:56 --- Subject: Re: [4.3 Regression] ICE in int_cst_value, at tree.c:8047 at -O3 Attached is a fix for this bug. I'll test it and then post it on gcc-patches. Sebastian --- Comment #10 from sebpop at gmail dot com 2007-12-28 17:56 --- Created an attachment (id=14839) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14839&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34458
[Bug tree-optimization/34413] gfortran.dg/ltrans-7.f90 doesn't work
--- Comment #5 from sebpop at gmail dot com 2007-12-16 08:31 --- Subject: Re: gfortran.dg/ltrans-7.f90 doesn't work On 16 Dec 2007 03:23:17 -, jvdelisle at gcc dot gnu dot org [EMAIL PROTECTED] wrote: --- Comment #4 from jvdelisle at gcc dot gnu dot org 2007-12-16 03:23 --- I think Sebastian committed this patch with the intent to fix the bug. Usually we don't commit test cases until after we have it fixed and use the PR to track the issue. So I think we just wait for Sebastian to finish. If it's going to be a while, we can XFAIL it. Is this correct, Sebastian? Yes, I'm going to fix this tomorrow, either by xfailing the test or, better, by fixing the testcase with a patch for loop-linear. Sebastian -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34413
[Bug tree-optimization/19097] [4.1/4.2/4.3 regression] Quadratic behavior with many sets for the same register in VRP
--- Comment #44 from sebpop at gmail dot com 2007-11-11 05:16 --- Subject: Re: [4.1/4.2/4.3 regression] Quadratic behavior with many sets for the same register in VRP IMVHO this should be closed as WONTFIX. Steven, why isn't your patch from comment #37 a candidate for fixing this bug? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19097
[Bug tree-optimization/28868] [4.0/4.1/4.2/4.3 Regression] Not eliminating the PHIs which have the same arguments
--- Comment #12 from sebpop at gmail dot com 2007-11-05 06:13 --- Subject: Re: [4.0/4.1/4.2/4.3 Regression] Not eliminating the PHIs which have the same arguments Replacing ssa names with other ssa names willy-nilly is not always a win. We eventually ended up with heuristics to not change loop depths of ssa names, etc. See also PR23821, where we reach the exact same conclusion: DOM and VRP are playing the replace SSA_NAMEs game, and we're losing at this game as the substitution is done randomly... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28868
[Bug tree-optimization/33319] [4.2/4.3 regression] g++.dg/tree-ssa/pr27549.C ICE with vectorization
--- Comment #9 from sebpop at gmail dot com 2007-11-03 20:39 --- Subject: Re: [4.2/4.3 regression] g++.dg/tree-ssa/pr27549.C ICE with vectorization I cannot reproduce the bug on i686-linux either. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33319
[Bug tree-optimization/32540] [4.3 Regression] Exponential time behavior in PRE
--- Comment #13 from sebpop at gmail dot com 2007-11-03 05:19 --- Subject: Re: [4.3 Regression] Exponential time behavior in PRE Yes, the heuristics can sometimes generate a very large number of copies to eliminate a single redundancy. This is just the way the standard PRE heuristics work. If you want to try to come up with a better one, you are welcome to :) What about stopping the computation when we see that there are too many values that are anticipable? Here is a patch that restores the compile time on all the reported testcases. The constant should be a param, and the default value should probably be higher.

Index: tree-ssa-pre.c
===
--- tree-ssa-pre.c (revision 129775)
+++ tree-ssa-pre.c (working copy)
@@ -1847,6 +1847,13 @@ compute_partial_antic_aux (basic_block b
   if (block_has_abnormal_pred_edge)
     goto maybe_dump_sets;

+  /* If there are too many partially anticipatable values in the
+     block, phi_translate_set can take an exponential time: stop
+     before the translation starts.  */
+  if (single_succ_p (block)
+      && bitmap_count_bits (PA_IN (single_succ (block))->expressions) > 10)
+    goto maybe_dump_sets;
+
   old_PA_IN = PA_IN (block);
   PA_OUT = bitmap_set_new ();

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32540
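The cutoff proposed in the patch above boils down to "count the set bits, bail out past a limit". A minimal standalone sketch of that guard follows, assuming a plain array of 64-bit words in place of GCC's bitmap type; `count_bits` and `too_many_values_p` are hypothetical names, and the limit is passed in as a parameter, as the comment suggests it should be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Count the set bits of a bitmap stored as an array of 64-bit words.
   This plays the role that bitmap_count_bits plays in the patch.  */
static unsigned
count_bits (const uint64_t *words, size_t n_words)
{
  assert (words != NULL || n_words == 0);
  unsigned count = 0;
  for (size_t i = 0; i < n_words; i++)
    for (uint64_t w = words[i]; w != 0; w &= w - 1)
      count++;   /* each step clears the lowest set bit */
  return count;
}

/* Hypothetical guard: true when the set holds more than LIMIT values,
   in which case the expensive translation step is skipped.  */
static bool
too_many_values_p (const uint64_t *words, size_t n_words, unsigned limit)
{
  return count_bits (words, n_words) > limit;
}
```

The guard trades a missed optimization for bounded compile time: blocks over the limit are simply left alone, which is why the choice of the default limit matters.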
[Bug tree-optimization/32540] [4.3 Regression] Exponential time behavior in PRE
--- Comment #14 from sebpop at gmail dot com 2007-11-03 05:26 --- Subject: Re: [4.3 Regression] Exponential time behavior in PRE With the patch, compile time goes down also for PR33922. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32540
[Bug tree-optimization/32540] [4.3 Regression] Exponential time behavior in PRE
--- Comment #15 from sebpop at gmail dot com 2007-11-03 05:54 --- Subject: Re: [4.3 Regression] Exponential time behavior in PRE And I just saw that there is already a patch for this bug attached unfortunately to PR32575. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32540
[Bug tree-optimization/33707] missed optimization with dependency checker
--- Comment #1 from sebpop at gmail dot com 2007-11-03 06:31 --- Subject: Re: New: missed optimization with dependency checker

int foo (char *a, unsigned n)
{
  int i;
  a[0] = 0;
  for (i = 16; i < n; i++)
    a[i] = a[i-16];
}

We're failing to analyse the base of the array 'a' for this code, as there is a cast from signed int to unsigned int for the main iv:

# i.0D.1181_20 = PHI <i.0D.1181_4(5), 16(3)>
# iD.1177_19 = PHI <iD.1177_12(5), 16(3)>
D.1182_7 = aD.1173_2(D) + i.0D.1181_20;
iD.1177_12 = iD.1177_19 + 1;
i.0D.1181_4 = (unsigned intD.3) iD.1177_12;
if (i.0D.1181_4 < nD.1174_5(D))

This is due to the fact that we have to convert 'i' to unsigned before comparing with 'n'. The exact same testcase with just a signed type for 'n' is vectorized:

int foo (char *a, int n)
{
  int i;
  a[0] = 0;
  for (i = 16; i < n; i++)
    a[i] = a[i-16];
}

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33707
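The conversion mentioned above comes from C's usual arithmetic conversions: when an int is compared with an unsigned int, the int operand is converted to unsigned first. The sketch below spells that out; `less_mixed`, `less_signed` and `check_comparisons` are illustrative names only.

```c
#include <assert.h>
#include <stdbool.h>

/* `i < n` with int I and unsigned N: I is converted to unsigned before
   the comparison, which is the cast that shows up in the dump.  */
static bool
less_mixed (int i, unsigned n)
{
  return (unsigned) i < n;   /* same result as writing `i < n` */
}

/* With both operands signed, no conversion is inserted.  */
static bool
less_signed (int i, int n)
{
  return i < n;
}

/* Small usage check: the two forms agree for non-negative I and
   differ for negative I, which wraps to a large unsigned value.  */
void
check_comparisons (void)
{
  assert (less_mixed (16, 32) && less_signed (16, 32));
  assert (!less_mixed (-1, 32));
  assert (less_signed (-1, 32));
}
```

Because the converted induction variable is a different SSA name from the one used to index the array, the analysis has to look through the cast to relate the loop bound to the access.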
[Bug tree-optimization/33113] Failing to represent the stride (with array) of a dataref when it is not a constant
--- Comment #5 from sebpop at gmail dot com 2007-10-31 23:43 --- Subject: Re: Failing to represent the stride (with array) of a dataref when it is not a constant Making us return symbolic stride would not be hard. The problem is that data dependence analysis would fail anyway, True, and the vectorizer also has to be fixed to not consider INTEGER_CST strides only. sometimes (not in this testcase) there won't be a need for dependence testing - e.g. a reduction computation where there are no stores, or initialization with a constant (i.e. a store and no loads), so there's already a value in doing this. This patch would let symbolic non integer_cst steps be computed and stored in DR_STEP:

Index: tree-data-ref.c
===
--- tree-data-ref.c (revision 129797)
+++ tree-data-ref.c (working copy)
@@ -657,7 +657,7 @@ dr_analyze_innermost (struct data_refere
       offset_iv.base = ssize_int (0);
       offset_iv.step = ssize_int (0);
     }
-  else if (!simple_iv (loop, stmt, poffset, &offset_iv, false))
+  else if (!simple_iv (loop, stmt, poffset, &offset_iv, true))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
         fprintf (dump_file, "failed: evolution of offset is not affine.\n");

but the problem is that in the vectorizer, DR_STEP has to be an INTEGER_CST: for instance, step = TREE_INT_CST_LOW (DR_STEP (dra)); ... || tree_int_cst_compare (DR_STEP (dra), DR_STEP (drb))) and plenty of other places will ICE if we feed them with symbolic strides.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33113
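To make the distinction concrete, here is a sketch of the two kinds of access the discussion above is about. The functions are hypothetical examples, not testcases from the bug report: the first has a step that is a compile-time constant, the second a step known only at run time, and both are of the "store and no loads" kind mentioned in the comment.

```c
#include <assert.h>
#include <stddef.h>

/* Constant stride: each iteration advances the address by
   2 * sizeof (int), a step that can be written as an integer
   constant.  */
static void
clear_const_stride (int *a, size_t n)
{
  assert (a != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    a[2 * i] = 0;
}

/* Symbolic stride: the step is STRIDE * sizeof (int), loop invariant
   but not a constant.  */
static void
clear_symbolic_stride (int *a, size_t n, size_t stride)
{
  assert (a != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    a[i * stride] = 0;
}
```

In the second loop the step is an expression in `stride`, so any consumer that reads the step as an integer constant has nothing valid to read.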
[Bug tree-optimization/32375] not vectorized: can't determine dependence (array sections)
--- Comment #6 from sebpop at gmail dot com 2007-10-30 17:59 --- Subject: Re: not vectorized: can't determine dependence (array sections) I would like to keep the two bugs, PR32375 and PR32378, open as we can vectorize them without having to version the code. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32375