------- Comment #9 from dominiq at lps dot ens dot fr  2010-08-24 11:47 -------
> Do you see the slowdown as well if you drop -funroll-loops?  

Yes

[macbook] lin/test% gfc -Ofast test_fpu_red.f90
[macbook] lin/test% time a.out
Test1 - Gauss 2000 (101x101) inverts  3.0 sec  Err= 0.000000000000006
3.208u 0.072s 0:03.28 99.6%     0+0k 0+0io 0pf+0w
[macbook] lin/test% gfcp -Ofast test_fpu_red.f90
[macbook] lin/test% time a.out
Test1 - Gauss 2000 (101x101) inverts  2.2 sec  Err= 0.000000000000006
2.440u 0.076s 0:02.52 99.6%     0+0k 0+0io 0pf+0w

> Do you see the slowdown with just -O2?

No

[macbook] lin/test% gfc -O2 test_fpu_red.f90
[macbook] lin/test% time a.out
Test1 - Gauss 2000 (101x101) inverts  3.1 sec  Err= 0.000000000000006
3.328u 0.071s 0:03.40 99.7%     0+0k 0+0io 0pf+0w
[macbook] lin/test% gfcp -O2 test_fpu_red.f90
[macbook] lin/test% time a.out
Test1 - Gauss 2000 (101x101) inverts  3.1 sec  Err= 0.000000000000006
3.330u 0.073s 0:03.40 100.0%    0+0k 0+0io 0pf+0w

but I see it with -O2 -ftree-vectorize

[macbook] lin/test% gfc -O2 -ftree-vectorize test_fpu_red.f90
[macbook] lin/test% time a.out
Test1 - Gauss 2000 (101x101) inverts  3.1 sec  Err= 0.000000000000006
3.318u 0.070s 0:03.39 99.7%     0+0k 0+0io 0pf+0w
[macbook] lin/test% gfcp -O2 -ftree-vectorize test_fpu_red.f90
[macbook] lin/test% time a.out
Test1 - Gauss 2000 (101x101) inverts  2.3 sec  Err= 0.000000000000006
2.498u 0.076s 0:02.57 99.6%     0+0k 0+0io 0pf+0w

although I do not see any difference between the gfc and gfcp outputs
with -ftree-vectorizer-verbose=2.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45379
