[Bug tree-optimization/34265] Missed optimizations

2011-09-26 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED

[Bug tree-optimization/34265] Missed optimizations

2011-09-17 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265 Dominique d'Humieres dominiq at lps dot ens.fr changed: What|Removed |Added CC||irar at

[Bug tree-optimization/34265] Missed optimizations

2011-09-16 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265 --- Comment #35 from Dominique d'Humieres dominiq at lps dot ens.fr 2011-09-16 15:42:15 UTC --- This pr (as well as pr49006) seems to have been fixed between revisions 176696 and 177649. I am closing pr49006 as fixed and I'll use this pr to

[Bug tree-optimization/34265] Missed optimizations

2011-05-22 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265 --- Comment #34 from Dominique d'Humieres dominiq at lps dot ens.fr 2011-05-22 12:06:20 UTC --- Created attachment 24325 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24325 reduced tests The attached bzipped tar contains the files

[Bug tree-optimization/34265] Missed optimizations

2008-04-23 Thread dominiq at lps dot ens dot fr
--- Comment #33 from dominiq at lps dot ens dot fr 2008-04-23 21:26 --- Created an attachment (id=15523) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15523&action=view) induct.f90 variants and their diff with the original file The original diffs have space problems. --

[Bug tree-optimization/34265] Missed optimizations

2007-12-03 Thread dominiq at lps dot ens dot fr
--- Comment #27 from dominiq at lps dot ens dot fr 2007-12-03 14:32 --- I have had a look at the failure of gfortran.dg/array_1.f90 with patch #5. The following reduced code gives the same failure: ! { dg-do run } ! PR 15553 : the array used to be filled with garbage ! this problem

[Bug tree-optimization/34265] Missed optimizations

2007-12-03 Thread dominiq at lps dot ens dot fr
--- Comment #28 from dominiq at lps dot ens dot fr 2007-12-03 14:33 --- Created an attachment (id=14691) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14691&action=view) result of -fdump-tree-optimized with patch #5 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265

[Bug tree-optimization/34265] Missed optimizations

2007-12-03 Thread dominiq at lps dot ens dot fr
--- Comment #26 from dominiq at lps dot ens dot fr 2007-12-03 14:08 --- "IMO, SLP should vectorize the sequence." Uros, what is the meaning of the above sentence? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265

[Bug tree-optimization/34265] Missed optimizations

2007-12-03 Thread dominiq at lps dot ens dot fr
--- Comment #29 from dominiq at lps dot ens dot fr 2007-12-03 14:34 --- Created an attachment (id=14692) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14692&action=view) result of -fdump-tree-optimized without patch #5 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265

[Bug tree-optimization/34265] Missed optimizations

2007-12-03 Thread ubizjak at gmail dot com
--- Comment #30 from ubizjak at gmail dot com 2007-12-03 16:30 --- (In reply to comment #26) IMO, SLP should vectorize the sequence. What is the meaning of the above sentence? Uh, sorry for being terse. If there are no loops, then straight-line parallelization [SLP] should vectorize

[Bug tree-optimization/34265] Missed optimizations

2007-12-03 Thread dominiq at lps dot ens dot fr
--- Comment #31 from dominiq at lps dot ens dot fr 2007-12-03 18:58 --- If there are no loops, then straight-line parallelization [SLP] should vectorize your manually unrolled sequence in comment #24. Yes it should, but it does not after patch #5. The unanswered question so far

[Bug tree-optimization/34265] Missed optimizations

2007-12-03 Thread irar at il dot ibm dot com
--- Comment #32 from irar at il dot ibm dot com 2007-12-04 06:56 --- (In reply to comment #30) Uh, sorry for being terse. If there are no loops, then straight-line parallelization [SLP] should vectorize your manually unrolled sequence in comment #24. Currently only loop-aware SLP
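For readers unfamiliar with the term, the kind of loop-free, "manually unrolled" sequence discussed in comments #24 to #32 can be pictured with the C sketch below. It is only an illustration: the thread is about Fortran code, the function name `add4` is hypothetical, and nothing here claims that any particular GCC revision does or does not vectorize it.

```c
/* Hypothetical illustration of straight-line code with no loop.
 * The four statements are isomorphic (same operation, adjacent
 * elements), which is the pattern an SLP vectorizer looks for. */
void add4(double *out, const double *x, const double *y)
{
    out[0] = x[0] + y[0];
    out[1] = x[1] + y[1];
    out[2] = x[2] + y[2];
    out[3] = x[3] + y[3];
}
```

Calling `add4(out, x, y)` with three arrays of at least four doubles stores the element-wise sums in `out`.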

[Bug tree-optimization/34265] Missed optimizations

2007-11-30 Thread ubizjak at gmail dot com
--- Comment #25 from ubizjak at gmail dot com 2007-11-30 21:38 --- (In reply to comment #24) "Then the loop is vectorized again." IMO, SLP should vectorize the sequence. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265

[Bug tree-optimization/34265] Missed optimizations

2007-11-29 Thread dominiq at lps dot ens dot fr
--- Comment #16 from dominiq at lps dot ens dot fr 2007-11-29 08:06 --- A quick report of the comparison between the regression results for revision 130500 + patch in comment #5 + Tobias' patch for pr34262 and revision 130489 + some patches applied to rev. 130500. I have the following

[Bug tree-optimization/34265] Missed optimizations

2007-11-29 Thread dominiq at lps dot ens dot fr
--- Comment #18 from dominiq at lps dot ens dot fr 2007-11-29 10:22 --- I have had a look at what's happening for kepler.f90 (from the 2004 polyhedron test suite?) and it looks like another missed vectorization: if I count the mulpd in the kepler.s files, I find 24 before the patch and

[Bug tree-optimization/34265] Missed optimizations

2007-11-29 Thread rguenth at gcc dot gnu dot org
--- Comment #17 from rguenth at gcc dot gnu dot org 2007-11-29 10:11 --- Doh, not only did I fail to diff the chunk mentioned in comment #6, but I also added the original unrolling pass, not the one only supposed to unroll inner loops #) So, change the passes.c hunk to Index:

[Bug tree-optimization/34265] Missed optimizations

2007-11-29 Thread dominiq at lps dot ens dot fr
--- Comment #19 from dominiq at lps dot ens dot fr 2007-11-29 10:40 --- Richard, I am not sure I understand your patch in comment #17. I already have in gcc/passes.c (after your patch in comment #5): NEXT_PASS (pass_merge_phi); NEXT_PASS (pass_vrp); NEXT_PASS

[Bug tree-optimization/34265] Missed optimizations

2007-11-29 Thread dominiq at lps dot ens dot fr
--- Comment #20 from dominiq at lps dot ens dot fr 2007-11-29 11:00 --- I have applied my interpretation of the first two changes in comment #17. gfortran.dg/array_1.f90 still aborts and induct.v3.f90 is still not vectorized. The good news is that induct.f90 is still properly unrolled

[Bug tree-optimization/34265] Missed optimizations

2007-11-29 Thread rguenther at suse dot de
--- Comment #21 from rguenther at suse dot de 2007-11-29 11:13 --- Subject: Re: Missed optimizations On Thu, 29 Nov 2007, dominiq at lps dot ens dot fr wrote: Richard, I am not sure to understand your patch in comment #17. I have already in gcc/passes.c (after your patch in

[Bug tree-optimization/34265] Missed optimizations

2007-11-29 Thread dominiq at lps dot ens dot fr
--- Comment #22 from dominiq at lps dot ens dot fr 2007-11-29 11:16 --- On top of the first two patches of comment #17, I have MOVED + NEXT_PASS (pass_tree_loop_init); + NEXT_PASS (pass_complete_unrolli); + NEXT_PASS (pass_tree_loop_done); to the first suggested place.

[Bug tree-optimization/34265] Missed optimizations

2007-11-29 Thread dominiq at lps dot ens dot fr
--- Comment #23 from dominiq at lps dot ens dot fr 2007-11-29 12:24 --- On top of the first two patches of comment #17, I have MOVED + NEXT_PASS (pass_tree_loop_init); + NEXT_PASS (pass_complete_unrolli); + NEXT_PASS (pass_tree_loop_done); to the second suggested place.

[Bug tree-optimization/34265] Missed optimizations

2007-11-29 Thread dominiq at lps dot ens dot fr
--- Comment #24 from dominiq at lps dot ens dot fr 2007-11-29 15:49 --- I think I have now a partial understanding of what is happening for the induct variants that do not vectorize with the patch in comment #5: they do not contain any loop inside the k loop. If I replace

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread dominiq at lps dot ens dot fr
--- Comment #1 from dominiq at lps dot ens dot fr 2007-11-28 15:30 --- Created an attachment (id=14654) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14654&action=view) Diffs between the original file and the simplest variants In induct.v1.f90 'nominator' and 'denominator' are

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread rguenth at gcc dot gnu dot org
--- Comment #2 from rguenth at gcc dot gnu dot org 2007-11-28 16:06 --- GCC doesn't have a facility to split the inner loop and move it out of the outer loops by introducing an array temporary. As for completely unrolling, this only happens for innermost loops(?) and you can tune the

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread dominiq at lps dot ens dot fr
--- Comment #3 from dominiq at lps dot ens dot fr 2007-11-28 16:14 --- Note that complete unrolling happens too late to help LIM or vectorization. Could this be translated as a YES to my first question: the fortran frontend should unroll computations for short vectors? --

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread rguenth at gcc dot gnu dot org
--- Comment #4 from rguenth at gcc dot gnu dot org 2007-11-28 16:17 --- I would in principle say no - we can instead improve the middle-end here. But it may pay off to not generate a loop for short vectors in case the resulting IL is smaller for example. Of course it would duplicate

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread rguenth at gcc dot gnu dot org
--- Comment #5 from rguenth at gcc dot gnu dot org 2007-11-28 16:33 --- Created an attachment (id=14655) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14655&action=view) patch for early complete unrolling of inner loops For example with a patch like this. --
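As a rough picture of what "complete unrolling of inner loops" means for short vectors, here is a small C sketch. It is an assumption-laden illustration, not code from the attached patch or from induct.f90: the names `dot3_loop` and `dot3_unrolled` and the length 3 are hypothetical, and the second function merely shows by hand the shape of code a compiler could produce from the first.

```c
/* Hypothetical illustration; not taken from the patch or induct.f90. */
#define N 3

/* Short inner loop with a trip count known at compile time. */
double dot3_loop(const double a[N], const double b[N])
{
    double sum = 0.0;
    for (int i = 0; i < N; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* The same computation with the loop completely unrolled:
 * no loop remains, only straight-line code in the same order. */
double dot3_unrolled(const double a[N], const double b[N])
{
    double sum = 0.0;
    sum += a[0] * b[0];
    sum += a[1] * b[1];
    sum += a[2] * b[2];
    return sum;
}
```

Both functions perform the same operations in the same order. The point made in comment #3 is about when in the pass pipeline such unrolling happens, so that later passes such as LIM or the vectorizer see the unrolled form.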

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread dominiq at lps dot ens dot fr
--- Comment #6 from dominiq at lps dot ens dot fr 2007-11-28 18:18 --- Subject: Re: Missed optimizations For example with a patch like this. You also need --- ../_gcc_clean/gcc/tree-flow.h 2007-11-16 16:17:46.0 +0100 +++ ../gcc-4.3-work/gcc/tree-flow.h 2007-11-28

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread dominiq at lps dot ens dot fr
--- Comment #7 from dominiq at lps dot ens dot fr 2007-11-28 18:48 --- Subject: Re: Missed optimizations With your patch the runtime went from 93.670u 0.103s 1:33.85 99.9%0+0k 0+0io 32pf+0w to 38.741u 0.038s 0:38.85 99.7%0+0k 0+1io 32pf+0w Pretty impressive! Note that

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread jb at gcc dot gnu dot org
--- Comment #8 from jb at gcc dot gnu dot org 2007-11-28 20:48 --- The vectorization of dot products is covered by PR31738, I suppose -- jb at gcc dot gnu dot org changed: What|Removed |Added

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread burnus at gcc dot gnu dot org
--- Comment #9 from burnus at gcc dot gnu dot org 2007-11-28 21:27 --- With your patch the runtime went from 93.670u 0.103s 1:33.85 99.9%0+0k 0+0io 32pf+0w to 38.741u 0.038s 0:38.85 99.7%0+0k 0+1io 32pf+0w Thus: 59% faster. Here, it only went ~30% down from 49.89s to
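The "59% faster" figure appears to be the relative reduction in user time between the two runs quoted from comment #7. A minimal C sketch of that arithmetic follows; the helper names are hypothetical and only the two timings (93.670 s and 38.741 s) come from the thread.

```c
/* Hypothetical helpers for checking the quoted timings. */

/* Percentage by which the runtime dropped:
 * 100 * (before - after) / before. */
double runtime_reduction_percent(double before_s, double after_s)
{
    return 100.0 * (before_s - after_s) / before_s;
}

/* Speedup factor: how many times faster the new run is. */
double speedup_factor(double before_s, double after_s)
{
    return before_s / after_s;
}
```

With `before_s = 93.670` and `after_s = 38.741`, the reduction comes out a little under 59% and the speedup factor is roughly 2.4, so "59% faster" here means the runtime shrank by about 59%, not that throughput rose by 59%.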

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread rguenth at gcc dot gnu dot org
--- Comment #10 from rguenth at gcc dot gnu dot org 2007-11-28 22:05 --- Indeed - unexpectedly impressive ;) The patch has (obviously) received no tuning as of the placement of the early unrolling in the pass pipeline and early unrolling is only done if that doesn't increase code-size

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread dominiq at lps dot ens dot fr
--- Comment #11 from dominiq at lps dot ens dot fr 2007-11-28 22:35 --- Here are the timings before and after the patch for the polyhedron tests and some variants: Before patch After patch Benchmark Ave Run Number Estim: Ave Run

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread steven at gcc dot gnu dot org
--- Comment #12 from steven at gcc dot gnu dot org 2007-11-28 22:49 --- The only timings significantly changed are actually the compile times, which go up significantly. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread kargl at gcc dot gnu dot org
--- Comment #13 from kargl at gcc dot gnu dot org 2007-11-28 23:06 --- (In reply to comment #12) The only timings significantly changed are actually the compile times, which go up significantly. Look at the kepler execution time. 22.73 s without the patch and 26.11 s with the

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread steven at gcc dot gnu dot org
--- Comment #14 from steven at gcc dot gnu dot org 2007-11-28 23:17 --- Yes, that too. It was more a sarcastic addendum to your remark that there were so few significantly changed numbers. It seemed to me you should not look at just the execution times ;-) --

[Bug tree-optimization/34265] Missed optimizations

2007-11-28 Thread dominiq at lps dot ens dot fr
--- Comment #15 from dominiq at lps dot ens dot fr 2007-11-28 23:57 --- If I am allowed to be sarcastic too, I'll say that the increase in compile time (worst case 11%, arithmetic average 5%) is not against the current trend one can see for instance in