https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #37 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
(In reply to Joost VandeVondele from comment #36)
> #pragma GCC optimize ( "-Ofast -fvariable-expansion-in-unroller
> -funroll-loops" )

What would be really beneficial for larger matrices is

-floop-nest-optimize

in particular the blocking it performs (this would be additional motivation for
PR14741 and for work on Graphite in general). I don't know whether the blocking
parameter can be specified by the user. In principle, loop-nest optimization
together with -Ofast (and ideally -march=native, which I assume we can't have
in libgfortran) would yield near-peak performance.
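
For readers unfamiliar with the term, here is a rough sketch of what "blocking"
(loop tiling) means for a matrix multiply, written out by hand. This is only an
illustration and not the libgfortran matmul or the code Graphite would
generate; the names matmul_naive and matmul_blocked and the tile size BLOCK are
hypothetical, and a sensible tile size would depend on the cache sizes of the
target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile size, chosen only for illustration. */
#define BLOCK 32

static size_t min_size(size_t x, size_t y) { return x < y ? x : y; }

/* Reference version: c += a * b for row-major n x n matrices. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    assert(a && b && c);
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++)
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

/* Blocked version: the same arithmetic, but the iteration space is
 * walked tile by tile so that each tile of a, b and c is reused
 * while it is still likely to be in cache. */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    assert(a && b && c);
    for (size_t ii = 0; ii < n; ii += BLOCK)
        for (size_t kk = 0; kk < n; kk += BLOCK)
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                /* Clamp tile bounds so n need not be a multiple of BLOCK. */
                size_t i_end = min_size(ii + BLOCK, n);
                size_t k_end = min_size(kk + BLOCK, n);
                size_t j_end = min_size(jj + BLOCK, n);
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++)
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += a[i * n + k] * b[k * n + j];
            }
}
```

Both functions accumulate into c, so the caller is expected to zero c first.
For each element of c the contributions are still added in increasing k order,
so without value-changing options such as -Ofast the two versions should give
identical results; only the memory access pattern differs.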
