[Bug c++/69564] lto and/or C++ make scimark2 LU slower
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69564

--- Comment #5 from vincenzo Innocente ---
it is a regression

gcc version 4.9.3 (GCC)
c++ -Ofast *.c; ./a.out
**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to p...@nist.gov)     **
**                                                              **
Using 2.00 seconds min time per kenel.
gcc -Ofast *.c; ./a.out c++ -v
Composite Score: 2449.06
FFT Mflops: 2046.03(N=1024)
SOR Mflops: 1654.04(100 x 100)
MonteCarlo: Mflops: 813.44
Sparse matmult Mflops: 2962.08(N=1000, nz=5000)
LU Mflops: 4769.72(M=100, N=100)

---
gcc -Ofast *.c -lm; ./a.out
**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to p...@nist.gov)     **
**                                                              **
Using 2.00 seconds min time per kenel.
Composite Score: 2475.22
FFT Mflops: 2064.19(N=1024)
SOR Mflops: 1633.01(100 x 100)
MonteCarlo: Mflops: 810.37
Sparse matmult Mflops: 2970.47(N=1000, nz=5000)
LU Mflops: 4898.06(M=100, N=100)
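
For readers who want to check the "regression" claim numerically, the following is a minimal sketch that parses the two GCC 4.9.3 result blocks quoted in this comment and compares the LU scores. It assumes the first block belongs to the c++ build and the second to the gcc build (the pasted shell lines are partly interleaved, so this attribution is an assumption), and `parse_scores` is a hypothetical helper, not part of SciMark2 or GCC.

```python
import re

# Result lines copied from the two runs quoted in this comment.
# Assumption: the first block is the c++ build, the second the gcc build.
CXX_RUN = """\
Composite Score: 2449.06
FFT Mflops: 2046.03(N=1024)
SOR Mflops: 1654.04(100 x 100)
MonteCarlo: Mflops: 813.44
Sparse matmult Mflops: 2962.08(N=1000, nz=5000)
LU Mflops: 4769.72(M=100, N=100)
"""

C_RUN = """\
Composite Score: 2475.22
FFT Mflops: 2064.19(N=1024)
SOR Mflops: 1633.01(100 x 100)
MonteCarlo: Mflops: 810.37
Sparse matmult Mflops: 2970.47(N=1000, nz=5000)
LU Mflops: 4898.06(M=100, N=100)
"""

# Kernel name, optional colon, then "Mflops:" or "Score:", then the number.
LINE_RE = re.compile(r"^(?P<name>.+?):?\s+(?:Mflops|Score):\s*(?P<value>\d+(?:\.\d+)?)")


def parse_scores(text):
    """Map each kernel name to its reported score (hypothetical helper)."""
    scores = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            scores[match.group("name")] = float(match.group("value"))
    return scores


cxx_scores = parse_scores(CXX_RUN)
c_scores = parse_scores(C_RUN)

# Ratio of the c++ build to the gcc build for the LU kernel.
lu_ratio = cxx_scores["LU"] / c_scores["LU"]
print(f"GCC 4.9.3 LU, c++ relative to gcc: {lu_ratio:.3f}")
```

Under that attribution, the two GCC 4.9.3 LU figures are within a few percent of each other, which is the basis for calling the much larger gap reported for GCC 6 a regression.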
[Bug c++/69564] lto and/or C++ make scimark2 LU slower
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69564

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org

--- Comment #4 from Richard Biener ---
(In reply to vincenzo Innocente from comment #3)
> > Any reason you are using the c++ driver here?
> Because I am interested in C++ performance
> never imagined that the c++ front-end could make a difference on such a
> code...
> From my point of view it is even a more severe regression than just "lto"

Yeah, didn't try to figure out whether the C vs. C++ thing is a regression.
But I suspect the change to the C++ loop lowering. Certainly needs closer
investigation.
[Bug c++/69564] lto and/or C++ make scimark2 LU slower
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69564

--- Comment #3 from vincenzo Innocente ---
> Any reason you are using the c++ driver here?
Because I am interested in C++ performance
never imagined that the c++ front-end could make a difference on such a
code...
From my point of view it is even a more severe regression than just "lto"
[Bug c++/69564] lto and/or C++ make scimark2 LU slower
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69564

--- Comment #2 from Richard Biener ---
It looks like we get different BB order out of C++ than C but otherwise no
real code-differences as far as I can see.
[Bug c++/69564] lto and/or C++ make scimark2 LU slower
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69564

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |lto, missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-02-01
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
          Component|lto                         |c++
            Summary|lto makes scimark2 LU       |lto and/or C++ make
                   |slower                      |scimark2 LU slower
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener ---
Any reason you are using the c++ driver here? I get

> gcc-6 -Ofast -flto *.c -lm -B /abuild/rguenther/trunk3-g/gcc
> ./a.out
**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to p...@nist.gov)     **
**                                                              **
Using 2.00 seconds min time per kenel.
Composite Score: 1729.02
FFT Mflops: 1247.04(N=1024)
SOR Mflops: 1537.70(100 x 100)
MonteCarlo: Mflops: 842.21
Sparse matmult Mflops: 1657.86(N=1000, nz=5000)
LU Mflops: 3360.29(M=100, N=100)

> gcc-6 -Ofast *.c -lm -B /abuild/rguenther/trunk3-g/gcc
> ./a.out
**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to p...@nist.gov)     **
**                                                              **
Using 2.00 seconds min time per kenel.
Composite Score: 1645.94
FFT Mflops: 1288.61(N=1024)
SOR Mflops: 1471.29(100 x 100)
MonteCarlo: Mflops: 459.90
Sparse matmult Mflops: 1665.91(N=1000, nz=5000)
LU Mflops: 3343.98(M=100, N=100)

Ok, when using g++ to compile things I _do_ get

> g++-6 -Ofast -flto *.c -lm -B /abuild/rguenther/trunk3-g/gcc
> ./a.out
**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to p...@nist.gov)     **
**                                                              **
Using 2.00 seconds min time per kenel.
Composite Score: 1321.43
FFT Mflops: 1261.86(N=1024)
SOR Mflops: 1533.77(100 x 100)
MonteCarlo: Mflops: 850.69
Sparse matmult Mflops: 1669.90(N=1000, nz=5000)
LU Mflops: 1290.93(M=100, N=100)

> g++-6 -Ofast *.c -lm -B /abuild/rguenther/trunk3-g/gcc
> ./a.out
**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to p...@nist.gov)     **
**                                                              **
Using 2.00 seconds min time per kenel.
Composite Score: 1492.12
FFT Mflops: 1279.12(N=1024)
SOR Mflops: 1479.86(100 x 100)
MonteCarlo: Mflops: 433.83
Sparse matmult Mflops: 1637.11(N=1000, nz=5000)
LU Mflops: 2630.71(M=100, N=100)

So even without LTO I get a hit in using C++ to compile LU. Interesting.
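
To put rough numbers on the "hit", here is a minimal sketch that computes the relative LU drop between the four runs in this comment. The Mflops values are copied from the output quoted here; the dictionary layout and the `relative_drop` helper are hypothetical names chosen for illustration, and single benchmark runs like these carry some noise, so the percentages are indicative only.

```python
# LU Mflops figures copied from the four runs quoted in comment #1.
LU_MFLOPS = {
    ("gcc-6", "lto"): 3360.29,
    ("gcc-6", "no-lto"): 3343.98,
    ("g++-6", "lto"): 1290.93,
    ("g++-6", "no-lto"): 2630.71,
}


def relative_drop(baseline, measured):
    """Fraction by which `measured` falls below `baseline` (hypothetical helper)."""
    return 1.0 - measured / baseline


# C++ driver versus C driver, without LTO.
drop_cxx = relative_drop(LU_MFLOPS[("gcc-6", "no-lto")], LU_MFLOPS[("g++-6", "no-lto")])

# C++ driver versus C driver, with LTO.
drop_cxx_lto = relative_drop(LU_MFLOPS[("gcc-6", "lto")], LU_MFLOPS[("g++-6", "lto")])

# Effect of enabling LTO within each driver.
lto_effect_cxx = relative_drop(LU_MFLOPS[("g++-6", "no-lto")], LU_MFLOPS[("g++-6", "lto")])
lto_effect_c = relative_drop(LU_MFLOPS[("gcc-6", "no-lto")], LU_MFLOPS[("gcc-6", "lto")])

print(f"g++ vs gcc, no LTO:   {drop_cxx:.1%} lower")
print(f"g++ vs gcc, with LTO: {drop_cxx_lto:.1%} lower")
print(f"LTO effect under g++: {lto_effect_cxx:.1%} lower")
print(f"LTO effect under gcc: {lto_effect_c:.1%} (negative means faster)")
```

On these figures the C++ driver alone costs roughly a fifth of the LU throughput, and adding LTO under g++ roughly halves it again, while LTO under gcc leaves LU essentially unchanged.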