On Thu, Feb 3, 2011 at 6:43 AM, Jack Howarth <howa...@bromo.med.uc.edu> wrote:
> Sebastian,
>   Below are the results for the Polyhedron 2005 benchmarks on
> x86_64-apple-darwin10 using -O3 -ffast-math -funroll-loops under gcc
> trunk at r169776, with -fgraphite-identity and with -fgraphite-identity
> -ftree-loop-linear. I am surprised at the absence of any impact from
> -ftree-loop-linear in either run-time or executable size. The increase
> in compile time on some of the benchmarks suggested it was in effect.
> Is this a poor combination of optimizations for -ftree-loop-linear or
> is Fortran less effective in using that optimization?

Well, I don't know of any bogously nested (hot) loop in Polyhedron, do you?

Richard.
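
For readers unfamiliar with the term, here is a minimal sketch of what a
"bogously nested" loop could mean, assuming it refers to a loop nest whose
inner index does not match the array's memory layout (Fortran arrays are
column-major). As far as I understand the GCC documentation,
-ftree-loop-linear is a loop interchange transformation, so this kind of
nest is what it would try to fix. The function names are hypothetical and
the code only models memory offsets; it is not a benchmark.

```python
def column_major_offset(i, j, n_rows):
    """Linear offset of element (i, j), 0-based, in a column-major array."""
    return i + j * n_rows


def visit_order(n_rows, n_cols, inner):
    """Offsets touched by a two-level loop nest; `inner` names the inner index."""
    if inner == "i":
        # Inner loop walks down a column: consecutive memory locations.
        return [column_major_offset(i, j, n_rows)
                for j in range(n_cols) for i in range(n_rows)]
    if inner == "j":
        # Inner loop walks along a row: jumps n_rows elements each step.
        return [column_major_offset(i, j, n_rows)
                for i in range(n_rows) for j in range(n_cols)]
    raise ValueError("inner must be 'i' or 'j'")


def strides(offsets):
    """Distance between successive accesses."""
    return [b - a for a, b in zip(offsets, offsets[1:])]


good = visit_order(3, 4, inner="i")  # well-nested for column-major storage
bad = visit_order(3, 4, inner="j")   # same elements, poor locality

print("inner i strides:", strides(good))
print("inner j strides:", strides(bad))
```

Both orders visit the same elements, so interchanging the loops is legal
here; only the access stride changes. If no hot loop in a benchmark has the
second shape, interchange has nothing to gain.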

>               Jack
> P.S. Hopefully when the remaining loop regressions in -fgraphite-identity
> are solved, the graphite results will improve a bit more.
>
> Using built-in specs.
> COLLECT_GCC=gcc-4
> COLLECT_LTO_WRAPPER=/sw/lib/gcc4.6/libexec/gcc/x86_64-apple-darwin10.7.0/4.6.0/lto-wrapper
> Target: x86_64-apple-darwin10.7.0
> Configured with: ../gcc-4.6-20110202/configure --prefix=/sw 
> --prefix=/sw/lib/gcc4.6 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.6/info 
> --with-build-config=bootstrap-lto --enable-stage1-languages=c,lto 
> --enable-languages=c,c++,fortran,lto,objc,obj-c++,java --with-gmp=/sw 
> --with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw --with-mpc=/sw 
> --with-system-zlib --x-includes=/usr/X11R6/include 
> --x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.6 --enable-checking=yes 
> --enable-cloog-backend=isl
> Thread model: posix
> gcc version 4.6.0 20110203 (experimental) (GCC)
>
> command=gfortran -O3 -ffast-math -funroll-loops
>
> Run-time
>           stock   -fgraphite-identity  -fgraphite-identity
>                                          -ftree-loop-linear
>
> ac            8.80         8.80           8.80
> aermod       17.32        17.43          17.43
> air           5.48         5.43           5.44
> capacita     32.45        32.52          32.53
> channel       1.84         1.84           1.84
> doduc        28.30        26.28          26.28
> fatigue       8.13         8.09           8.09
> gas_dyn       4.30         4.32           4.31
> induct       13.07        12.51          12.51
> linpk        15.47        15.41          15.41
> mdbx         11.21        11.21          11.21
> nf           29.91        30.20          30.01
> protein      32.86        32.21          32.20
> rnflow       23.94        24.18          24.17
> test_fpu      8.02         8.05           8.04
> tfft          1.87         1.87           1.87
>
> Compile-time
>           stock   -fgraphite-identity  -fgraphite-identity
>                                          -ftree-loop-linear
>
> ac            2.12          2.12          2.12
> aermod       57.45         59.22         59.30
> air           3.84          4.37          4.93
> capacita      2.82          2.94          3.07
> channel       1.00          1.20          1.33
> doduc         8.57          8.92          8.95
> fatigue       3.19          3.17          3.17
> gas_dyn       5.38          5.57          5.57
> induct        6.59          6.77          8.81
> linpk         1.08          1.33          1.31
> mdbx          2.83          2.92          2.92
> nf            3.09          3.08          3.10
> protein       8.51          8.70          8.67
> rnflow        9.94         10.09         10.09
> test_fpu      7.22          7.24          7.28
> tfft          0.81          0.88          0.83
>
> Executable size
>           stock   -fgraphite-identity  -fgraphite-identity
>                                          -ftree-loop-linear
>
> ac           50976         50976         50976
> aermod     1264832       1268928       1268928
> air          73984         82184         82184
> capacita     77976         77976         77976
> channel      34792         34792         34792
> doduc       193096        193096        193096
> fatigue      86032         86032         86032
> gas_dyn     119704        115608        115608
> induct      174848        174848        174848
> linpk        38648         38648         38648
> mdbx         82072         82072         82072
> nf           75912         71816         71816
> protein     131992        131992        131992
> rnflow      181080        181080        181080
> test_fpu    155048        150952        150952
> tfft         30760         30760         30760
>
>
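
The observation in the quoted message (compile time moves with
-ftree-loop-linear while run time does not) can be checked with a little
arithmetic. Below is a rough sketch using three rows copied from the tables
above; the helper names are hypothetical, and the numbers are single
measurements, so small differences may well be noise.

```python
# Rows copied from the tables above, in the order:
# (stock, -fgraphite-identity, -fgraphite-identity -ftree-loop-linear)
compile_time = {
    "air": (3.84, 4.37, 4.93),
    "channel": (1.00, 1.20, 1.33),
    "induct": (6.59, 6.77, 8.81),
}
run_time = {
    "air": (5.48, 5.43, 5.44),
    "channel": (1.84, 1.84, 1.84),
    "induct": (13.07, 12.51, 12.51),
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


def linear_effect(table):
    """Change attributable to adding -ftree-loop-linear on top of
    -fgraphite-identity, per benchmark."""
    return {name: percent_change(identity, linear)
            for name, (_stock, identity, linear) in table.items()}


for name, pct in linear_effect(compile_time).items():
    print(f"{name:8s} compile-time change: {pct:+.1f}%")
for name, pct in linear_effect(run_time).items():
    print(f"{name:8s} run-time change:     {pct:+.1f}%")
```

For these three benchmarks the compile-time change is in the double digits
while the run-time change stays well under one percent, which is consistent
with the pass running but finding nothing profitable to transform.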
