Here is my report on Fortran benchmarking. I compare the trunk dated 20080507 (no revision number, sorry) with the IRA branch at rev. 135035. I ran the Polyhedron benchmark (http://www.polyhedron.co.uk/polyhedron_benchmark_suite0html), which is probably the most widely used benchmark in the Fortran community. I don't have many data points, but they are very well converged: the "standard" parameters were used, which means each test, for each set of compilation options, is run between 10 and 100 times, until the timing standard deviation drops below 0.1%.
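For readers unfamiliar with that stopping rule, here is a minimal Python sketch of a "repeat until converged" loop. It is not the Polyhedron harness itself: the names `run_until_converged` and `estimated_error_pct` are hypothetical, and the error estimate (standard error of the mean, as a percentage of the mean) is an assumption about what the "Estim Err %" column represents.

```python
import itertools
import statistics


def estimated_error_pct(timings):
    """Assumed estimator: standard error of the mean, as a percentage of the mean."""
    mean = statistics.mean(timings)
    std_err = statistics.stdev(timings) / len(timings) ** 0.5
    return 100.0 * std_err / mean


def run_until_converged(run_once, min_runs=10, max_runs=100, target_pct=0.1):
    """Call run_once() (which returns one timing in seconds) between min_runs
    and max_runs times, stopping early once the estimated error is below
    target_pct. Returns (mean time, number of repeats, estimated error %)."""
    timings = [run_once() for _ in range(min_runs)]
    while estimated_error_pct(timings) >= target_pct and len(timings) < max_runs:
        timings.append(run_once())
    return statistics.mean(timings), len(timings), estimated_error_pct(timings)


if __name__ == "__main__":
    # Hypothetical stand-in for a real benchmark run: a fixed cycle of fake timings.
    fake_timings = itertools.cycle([10.00, 10.01, 9.99])
    mean, repeats, err = run_until_converged(lambda: next(fake_timings))
    print(f"mean={mean:.2f}s repeats={repeats} err={err:.4f}%")
```

In real use, `run_once` would launch the benchmark executable and return its wall-clock time; the fake timings above only keep the sketch self-contained.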
I compile with -march=native -ffast-math -funroll-loops -O3 and run on a dual-processor, dual-core machine with 8GB RAM. /proc/cpuinfo says it's a Dual-Core AMD Opteron(tm) Processor 2220 running at around 2.8 GHz.

Full timings are below, with a summary here:

Overall (judged by geometric mean execution time), IRA introduces a 2.2% regression in execution time (and a 2.7% regression in compilation time, consistent with my previous mail). Using the CB algorithm doesn't change this significantly. The performance regression is mainly due to one testcase, induct, which takes a 30% hit with IRA. If the performance of that one were the same with IRA as with the old allocator, the switch would be (for this benchmark) performance-neutral.

So I investigated the case of induct, and found that with the IRA branch compiler without -fira, it is already 30% slower than with trunk. Is this an issue with the IRA branch, or has the branch simply not been merged recently, so that it is missing a recent large improvement for induct on trunk? I'd appreciate it if you could enlighten me on this point.

Other than that small question, everything seems mostly good on the Fortran performance front.
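For anyone who wants to recheck the summary figures, here is a small Python sketch that recomputes them from the average run times in the mainline and -fira tables of this mail. The helper names (`pct_change`, `geometric_mean`) are illustrative only. Run on these figures, the ratio of the two geometric means appears to come out near +2.0%, while the plain average of the per-benchmark percentage changes is about +2.25%, so the two ways of summarizing differ slightly.

```python
import math

# Average run times in seconds, copied from the detailed timing tables.
MAINLINE = {
    "ac": 11.32, "aermod": 38.82, "air": 12.04, "capacita": 78.93,
    "channel": 10.12, "doduc": 35.21, "fatigue": 8.60, "gas_dyn": 10.08,
    "induct": 34.38, "linpk": 26.17, "mdbx": 16.41, "nf": 29.72,
    "protein": 57.54, "rnflow": 31.42, "test_fpu": 18.07, "tfft": 6.91,
}
IRA = {
    "ac": 11.50, "aermod": 41.10, "air": 12.00, "capacita": 83.01,
    "channel": 10.15, "doduc": 33.94, "fatigue": 8.52, "gas_dyn": 10.18,
    "induct": 44.79, "linpk": 25.70, "mdbx": 16.05, "nf": 29.94,
    "protein": 58.29, "rnflow": 31.47, "test_fpu": 17.92, "tfft": 6.86,
}


def pct_change(base, new):
    """Relative change of new versus base, in percent."""
    return 100.0 * (new / base - 1.0)


def geometric_mean(values):
    """Geometric mean, computed through logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


per_benchmark = {name: pct_change(MAINLINE[name], IRA[name]) for name in MAINLINE}
geo_change = pct_change(geometric_mean(MAINLINE.values()), geometric_mean(IRA.values()))
avg_change = sum(per_benchmark.values()) / len(per_benchmark)

for name, change in per_benchmark.items():
    print(f"{name:<9} {change:+7.2f}%")
print(f"change in geometric mean:      {geo_change:+.2f}%")
print(f"average of per-benchmark data: {avg_change:+.2f}%")
```

Swapping in the "Ave Run" column of the -fira -fira-algorithm=CB table gives the corresponding IRA-CB figures.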
Cheers,
FX


Comparison of execution time (see in fixed-width font):

Benchmark   Execution time, compared to mainline
Name             IRA     IRA-CB
---------   --------   --------
ac            +1.59%     +6.80%
aermod        +5.87%     +3.14%
air           -0.33%     -0.83%
capacita      +5.17%     +2.58%
channel       +0.30%      0.00%
doduc         -3.61%     -3.61%
fatigue       -0.93%     -2.67%
gas_dyn        0.99%     +2.48%
induct       +30.28%    +29.64%
linpk         -1.80%     -1.57%
mdbx          -2.19%     -2.74%
nf            +0.74%     -0.30%
protein       +1.30%     +1.58%
rnflow        +0.16%     +0.22%
test_fpu      -0.83%     -0.39%
tfft          -0.72%     +0.14%
----------------------------------
geometric
mean          +2.25%     +2.16%


Detailed timings for mainline:

Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            7.36     1175251    11.32       15  0.0938
aermod       90.03     2424785    38.82       14  0.0866
air           6.83     1365405    12.04       19  0.0983
capacita      2.54     1235764    78.93       23  0.0785
channel       1.66     1254613    10.12       19  0.0885
doduc        13.20     1416729    35.21       13  0.0870
fatigue       6.20     1299862     8.60       12  0.0951
gas_dyn       6.45     1269413    10.08      100  0.1026
induct       19.57     1593762    34.38       10  0.0965
linpk         1.43     1162116    26.17       77  0.2626
mdbx          3.37     1192451    16.41       24  0.0939
nf            7.65     1217536    29.72       68  0.1240
protein      12.77     1342400    57.54       10  0.0942
rnflow       12.81     1357019    31.42       12  0.0976
test_fpu     11.78     1331485    18.07       24  0.0879
tfft          1.13     1173880     6.91       24  0.0991

Geometric Mean Execution Time = 20.85 seconds


Timing for IRA branch with -fira:

Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            6.06     1158971    11.50       15  0.0979
aermod       94.22     2421896    41.10       12  0.0725
air           7.07     1352645    12.00       23  0.0899
capacita      2.89     1221980    83.01       25  0.1860
channel       1.81     1241539    10.15       31  0.0879
doduc        15.20     1404025    33.94       10  0.0628
fatigue       6.17     1273630     8.52       14  0.0966
gas_dyn       7.79     1256267    10.18       32  0.0920
induct       14.28     1567935    44.79       12  0.0772
linpk         1.44     1145546    25.70       77  0.0920
mdbx          3.54     1181755    16.05       15  0.0588
nf            7.73     1205207    29.94       66  0.0890
protein      12.89     1325392    58.29       10  0.0458
rnflow       12.45     1340531    31.47       12  0.0570
test_fpu     12.18     1312704    17.92       58  0.0797
tfft          1.28     1158396     6.86       32  0.0853

Geometric Mean Execution Time = 21.27 seconds


Timing for IRA branch with -fira -fira-algorithm=CB:

Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            6.33     1158907    12.09       14  0.0943
aermod       89.54     2421640    40.04       14  0.0877
air           7.44     1352613    11.94       30  0.0841
capacita      2.79     1221980    80.97       25  0.2601
channel       1.75     1241411    10.12       24  0.0909
doduc        14.12     1403417    33.94       10  0.0438
fatigue       5.90     1273630     8.37       16  0.0884
gas_dyn       7.01     1256267    10.33       38  0.0855
induct       13.74     1568287    44.57       13  0.0978
linpk         2.50     1145546    25.76       78  0.2625
mdbx          3.53     1181979    15.96       49  0.0619
nf            7.91     1205207    29.63       68  0.1055
protein      12.36     1325264    58.45       10  0.0717
rnflow       11.78     1340083    31.49       17  0.0892
test_fpu     11.49     1311040    18.00       18  0.0615
tfft          1.24     1158492     6.92       25  0.0807

Geometric Mean Execution Time = 21.25 seconds

-- 
FX Coudert
http://www.homepages.ucl.ac.uk/~uccafco/