http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40958
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed |Added
CC
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51285
--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch
2011-11-24 19:25:06 UTC ---
Simplified testcase showing that Tobias' patch is unrelated. Is this still triggered
by the same range?
SUBROUTINE smm_dnn_4_10_10_1_1_2_1(A,B
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51179
--- Comment #8 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch
2011-11-23 08:34:59 UTC ---
(In reply to comment #6)
(if nobody beats me, I'll try to reduce the code and open a new pr).
I reproduced the ICE with 4.7, and started
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51285
Bug #: 51285
Summary: [4.7 Regression] internal compiler error: in
check_loop_closed_ssa_use, at tree-ssa-loop-manip.c
Classification: Unclassified
Product: gcc
Version: 4.7.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51179
--- Comment #9 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch
2011-11-23 17:19:28 UTC ---
(In reply to comment #8)
(In reply to comment #6)
(if nobody beats me, I'll try to reduce the code and open a new pr).
I reproduced
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51285
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed |Added
CC
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51179
--- Comment #10 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch
2011-11-23 20:11:17 UTC ---
(In reply to comment #1)
What about current 4.7 SVN?
The fastest 4x10 . 10x10 multiply as found with tiny_find.f90 yields somewhat
better
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51179
--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch
2011-11-22 18:34:03 UTC ---
Created attachment 25887
-- http://gcc.gnu.org/bugzilla/attachment.cgi?id=25887
general code
the more general code used to find the most
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51179
--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch
2011-11-22 18:34:48 UTC ---
(In reply to comment #3)
is IMHO just a matter whether graphite can -floop-interchange this or not.
If you swap manually the l and j
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51179
Bug #: 51179
Summary: poor vectorization on interlagos.
Classification: Unclassified
Product: gcc
Version: 4.6.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119
--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch
2011-11-15 12:19:59 UTC ---
(In reply to comment #1)
I have a cunning plan.
It is doable to come within a factor of 2 of highly efficient implementations
using a cache
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119
--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch
2011-11-15 12:31:10 UTC ---
Created attachment 25826
-- http://gcc.gnu.org/bugzilla/attachment.cgi?id=25826
comparison in performance for small matrix multiplies