--- Comment #5 from dominiq at lps dot ens dot fr 2010-01-06 19:20 ---
"-O3 -floop-interchange -ftree-loop-distribution" also gives wrong code. This
PR could be a duplicate of PR42637.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42479
--- Comment #4 from dominiq at lps dot ens dot fr 2010-01-05 14:35 ---
Note that the inner loops in subroutine mutual_ind_quad_rec_coil are not
vectorized at -O3, unless -ffast-math is used. Timing the code with and without
-ffast-math gives
[macbook] lin/test% gfc -O3 induct.f90
[macbo
--- Comment #3 from dominiq at lps dot ens dot fr 2010-01-05 12:56 ---
Profiling without -floop-block
+ 99.8%, start, a.out
| + 99.8%, main, a.out
| | + 99.8%, induct_, a.out
| | | + 77.5%, __mqc_m_MOD_mutual_ind_quad_cir_coil, a.out
| | | | 2.8%, cosisin, libSystem.B.dylib
| | | | -
--- Comment #2 from fxcoudert at gcc dot gnu dot org 2010-01-05 11:42 ---
Created an attachment (id=19471)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19471&action=view)
Source file and input file
Compile induct.f90 and run with induct.in in the same directory.
--- Comment #1 from fxcoudert at gcc dot gnu dot org 2010-01-05 11:40 ---
Also happens at rev. 155544 on x86_64-unknown-linux-gnu, with both -m32 and
-m64. "-O3 -floop-block" gives different results than "-O3" alone (and is much
faster). Profiling should indicate what part of code is di