https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84327

--- Comment #1 from xyzdr4gon333 at googlemail dot com ---
This bug matters more for the actual real-life example, which is slower at -O2
than at -O1! In the earlier attached file you only have to replace the
`interleaveZeros` function with this one:

unsigned int interleaveTwoZeros( unsigned int n )
{
    n &= 0x000003ff;
    n = (n ^ (n << 16)) & 0xFF0000FF;
    n = (n ^ (n <<  8)) & 0x0300F00F;
    n = (n ^ (n <<  4)) & 0x030C30C3;
    n = (n ^ (n <<  2)) & 0x09249249;
    return n;
}

I.e. the only difference is the slightly different constants, nothing else! The
timings (first row: no optimization flag given):

        1234567890 iterations took 19.151s and resulted in 806157809
    -O0 1234567890 iterations took 19.1547s and resulted in 1772082360
    -O1 1234567890 iterations took 5.69619s and resulted in 2085417644
    -O2 1234567890 iterations took 6.21504s and resulted in 32256352
    -O3 1234567890 iterations took 6.14414s and resulted in 357018037
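
For readers who want to confirm what interleaveTwoZeros computes, here is a
minimal sketch that compares it against a bit-by-bit reference over all 1024
inputs that survive the initial 10-bit mask. The names interleaveTwoZerosNaive,
countMismatches and selfCheck are hypothetical helpers, not part of the
attached optimizeFlags.cpp, and the sketch assumes unsigned int is at least 32
bits wide.

```cpp
#include <cassert>

// Function from this comment, reproduced so the sketch is self-contained.
unsigned int interleaveTwoZeros( unsigned int n )
{
    n &= 0x000003ff;
    n = (n ^ (n << 16)) & 0xFF0000FF;
    n = (n ^ (n <<  8)) & 0x0300F00F;
    n = (n ^ (n <<  4)) & 0x030C30C3;
    n = (n ^ (n <<  2)) & 0x09249249;
    return n;
}

// Hypothetical reference: move bit i of the low 10 bits to bit 3*i,
// leaving two zero bits between consecutive input bits.
unsigned int interleaveTwoZerosNaive( unsigned int n )
{
    unsigned int result = 0;
    for ( unsigned int i = 0; i < 10; ++i )
        result |= ( ( n >> i ) & 1u ) << ( 3 * i );
    return result;
}

// Counts the inputs in [0, 1024) on which the two versions disagree.
unsigned int countMismatches()
{
    unsigned int mismatches = 0;
    for ( unsigned int n = 0; n < 1024; ++n )
        if ( interleaveTwoZeros( n ) != interleaveTwoZerosNaive( n ) )
            ++mismatches;
    return mismatches;
}

// Aborts (in builds without NDEBUG) if the two versions ever disagree.
void selfCheck()
{
    assert( countMismatches() == 0 );
}
```

Calling selfCheck() once before the timing loop would be enough. The check only
covers correctness of the bit trick and says nothing about the -O1 versus -O2
timing difference reported here.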

Not sure if this is worth another bug. I tested the following compiler
versions:

for GPP in g++-4.9 g++-5 g++-6 g++-7 g++-8; do 
    $GPP --version | head -1
    for flag in '   ' -O0 -O1 -O2 -O3; do 
        echo -n "$flag "
        $GPP $flag -std=c++11 optimizeFlags.cpp && ./a.out
    done
done

    g++-4.9 (Debian 4.9.4-2) 4.9.4
        1234567890 iterations took 19.1979s and resulted in 1918993912
    -O0 1234567890 iterations took 19.1785s and resulted in 710267642
    -O1 1234567890 iterations took 5.6609s and resulted in 1898524753
    -O2 1234567890 iterations took 5.71375s and resulted in 1117037030
    -O3 1234567890 iterations took 5.67933s and resulted in 1451088646
    g++-5 (Debian 5.5.0-8) 5.5.0 20171010
        1234567890 iterations took 19.2387s and resulted in 999898210
    -O0 1234567890 iterations took 19.1464s and resulted in 1358121256
    -O1 1234567890 iterations took 5.64181s and resulted in 642760018
    -O2 1234567890 iterations took 5.65094s and resulted in 191105767
    -O3 1234567890 iterations took 5.68849s and resulted in 1555980094
    g++-6 (Debian 6.4.0-12) 6.4.0 20180123
        1234567890 iterations took 19.1786s and resulted in 1613186065
    -O0 1234567890 iterations took 19.2001s and resulted in 424276129
    -O1 1234567890 iterations took 5.73263s and resulted in 1828427433
    -O2 1234567890 iterations took 6.16005s and resulted in 814826690
    -O3 1234567890 iterations took 6.1438s and resulted in 867162058
    g++-7 (Debian 7.3.0-3) 7.3.0
        1234567890 iterations took 19.1302s and resulted in 1147954921
    -O0 1234567890 iterations took 19.1694s and resulted in 734785107
    -O1 1234567890 iterations took 5.72652s and resulted in 1133709951
    -O2 1234567890 iterations took 6.15633s and resulted in 352136223
    -O3 1234567890 iterations took 6.14089s and resulted in 1468150013
    g++-8 (Debian 8-20180207-2) 8.0.1 20180207 (experimental) [trunk revision 257435]
        1234567890 iterations took 19.1278s and resulted in 694826541
    -O0 1234567890 iterations took 19.1454s and resulted in 249938642
    -O1 1234567890 iterations took 5.72959s and resulted in 365780913
    -O2 1234567890 iterations took 6.20064s and resulted in 2033700921
    -O3 1234567890 iterations took 6.12829s and resulted in 1244532281

=> This looks like a regression introduced in g++ 6: g++-4.9 and g++-5 show no
-O2 slowdown, while g++-6, g++-7 and g++-8 all do!

Combining -O1 with the additional flags that -O2 enables also reproduces the
weird slowdown:

   g++ -O1 "${O2Flags[@]}" -std=c++11 optimizeFlags.cpp && ./a.out
     => 6.16161s

Bisecting the additional -O2 flags traces this down to
-finline-small-functions. I will open another bug for this.
