[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

Alexander Monakov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |INVALID

--- Comment #14 from Alexander Monakov ---
Ah, the linked report actually says very clearly that fixes landed in Glibc
2.25, so I'll close this bug: nothing to do on GCC side about this.
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #13 from Alexander Monakov ---
This is most likely a variant of
https://bugzilla.redhat.com/show_bug.cgi?id=1421121
so hitting this bug requires a specific CPU model.

It looks as if SSE-AVX transition penalties appear when switching between
pure-SSE sinf code and VEX-prefixed SSE code in the main program, after the
ld.so runtime resolver has affected AVX state tracking in the CPU.

I'm not sure if any patches have landed on the Glibc side to avoid this, but in
any case this should be re-reported against Glibc if needed; GCC cannot improve
the situation.

An easy workaround would be to pass -Wl,-z,now when linking.
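
The workaround from comment #13 could be tried roughly as follows. This is only a sketch: it assumes testcase2.c is the reduced test case attached to this bug and that gcc and readelf are on the PATH, and the output name testcase2-now is made up for illustration. Nothing in this thread confirms that eager binding removes the slowdown on the affected CPU.

```shell
# Linker option suggested in comment #13: resolve all symbols at load time,
# so the lazy ld.so runtime resolver is not entered on the first sinf call.
bind_now_flags='-Wl,-z,now'
src=testcase2.c   # assumption: the reduced test case from this bug

if [ -f "$src" ] && command -v gcc >/dev/null 2>&1; then
    # Same options as the slow build, plus eager binding.
    # "testcase2-now" is a hypothetical output name.
    gcc -O3 -march=native "$src" -lm $bind_now_flags -o testcase2-now

    # The dynamic section should now list BIND_NOW / NOW among its flags.
    readelf -d testcase2-now | grep -E 'BIND_NOW|FLAGS'
else
    echo "need gcc and $src in the current directory" >&2
fi
```

To test an already linked binary without rebuilding it, glibc's dynamic linker also honours the LD_BIND_NOW environment variable, for example "LD_BIND_NOW=1 perf stat ./a.out".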
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #12 from Martin Reinecke ---
Created attachment 43961
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43961&action=edit
perf annotate output with -march=native
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #11 from Martin Reinecke ---
Created attachment 43960
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43960&action=edit
perf annotate output without -march=native
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #10 from Martin Liška ---
And please rebuild the binaries with -g and attach perf annotate output.
Thanks.
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #9 from Martin Reinecke ---
Sure!

martin@martin-Latitude-E7450 ~/tmp $ gcc -O3 testcase2.c -lm
martin@martin-Latitude-E7450 ~/tmp $ perf stat ./a.out

 Performance counter stats for './a.out':

       1109,985866      task-clock (msec)         #    0,999 CPUs utilized
                 2      context-switches          #    0,002 K/sec
                 0      cpu-migrations            #    0,000 K/sec
             2.000      page-faults               #    0,002 M/sec
     3.155.388.780      cycles                    #    2,843 GHz
     6.717.336.961      instructions              #    2,13  insn per cycle
       979.526.022      branches                  #  882,467 M/sec
            38.112      branch-misses             #    0,00% of all branches

       1,110639187 seconds time elapsed

martin@martin-Latitude-E7450 ~/tmp $ gcc -O3 testcase2.c -march=native -lm
martin@martin-Latitude-E7450 ~/tmp $ perf stat ./a.out

 Performance counter stats for './a.out':

       7724,004864      task-clock (msec)         #    1,000 CPUs utilized
                86      context-switches          #    0,011 K/sec
                 1      cpu-migrations            #    0,000 K/sec
             2.004      page-faults               #    0,259 K/sec
    22.129.645.853      cycles                    #    2,865 GHz
     6.723.657.441      instructions              #    0,30  insn per cycle
       980.761.202      branches                  #  126,976 M/sec
           171.813      branch-misses             #    0,02% of all branches

       7,726058359 seconds time elapsed
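
The two perf stat runs above can be compared with a little arithmetic. The sketch below copies the counter values from that output (with the locale-specific thousands separators removed and the decimal comma written as a point); the helper name ratio and the variable names are illustrative only. It suggests that both builds retire almost the same number of instructions, while the -march=native build needs about seven times as many cycles, which fits a per-instruction stall rather than extra work being executed.

```shell
# ratio NUM DEN: print NUM/DEN rounded to two decimals.
# LC_ALL=C keeps awk's decimal separator a point regardless of the user's locale.
ratio() {
    LC_ALL=C awk -v n="$1" -v d="$2" 'BEGIN { printf "%.2f\n", n / d }'
}

# Instructions per cycle, recomputed from the raw counters.
ipc_base=$(ratio 6717336961 3155388780)      # build without -march=native
ipc_native=$(ratio 6723657441 22129645853)   # build with -march=native

# How much more the -march=native build needs, relative to the baseline.
insn_x=$(ratio 6723657441 6717336961)        # retired instructions
cycles_x=$(ratio 22129645853 3155388780)     # CPU cycles
time_x=$(ratio 7724.004864 1109.985866)      # task-clock in msec

echo "IPC without -march=native: $ipc_base"
echo "IPC with    -march=native: $ipc_native"
echo "instructions x$insn_x, cycles x$cycles_x, task-clock x$time_x"
```

The recomputed IPC values agree with the 2,13 and 0,30 "insn per cycle" figures that perf itself reports.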
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

Alexander Monakov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #8 from Alexander Monakov ---
Can you also run the tests under 'perf stat'?
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #7 from Martin Reinecke ---
Here is the output of --verbose:

martin@martin-Latitude-E7450 ~/tmp $ gcc -O3 testcase2.c --verbose -lm -S
Using built-in specs.
COLLECT_GCC=gcc
Target: x86_64-pc-linux-gnu
Configured with: /home/martin/codes/gccgit/configure --disable-bootstrap --disable-multilib --prefix=/home/martin/codes/utrunk --enable-languages=c++,fortran --enable-target=all --enable-checking=release
Thread model: posix
gcc version 8.0.1 20180411 (experimental) [trunk revision b3ed066d3a5:895af22275d:7d24c3846c904b6e1ffea0bee0c58a9f7bcc23cb] (GCC)
COLLECT_GCC_OPTIONS='-O3' '-v' '-S' '-mtune=generic' '-march=x86-64'
 /home/martin/codes/utrunk/libexec/gcc/x86_64-pc-linux-gnu/8.0.1/cc1 -quiet -v -imultiarch x86_64-linux-gnu testcase2.c -quiet -dumpbase testcase2.c -mtune=generic -march=x86-64 -auxbase testcase2 -O3 -version -o testcase2.s
GNU C17 (GCC) version 8.0.1 20180411 (experimental) [trunk revision b3ed066d3a5:895af22275d:7d24c3846c904b6e1ffea0bee0c58a9f7bcc23cb] (x86_64-pc-linux-gnu)
        compiled by GNU C version 5.4.0 20160609, GMP version 6.1.0, MPFR version 3.1.4, MPC version 1.0.3, isl version isl-0.16.1-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/../../../../x86_64-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/include
 /usr/local/include
 /home/martin/codes/utrunk/include
 /home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C17 (GCC) version 8.0.1 20180411 (experimental) [trunk revision b3ed066d3a5:895af22275d:7d24c3846c904b6e1ffea0bee0c58a9f7bcc23cb] (x86_64-pc-linux-gnu)
        compiled by GNU C version 5.4.0 20160609, GMP version 6.1.0, MPFR version 3.1.4, MPC version 1.0.3, isl version isl-0.16.1-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 08c992208389b44fd477c64ce2b4d084
COMPILER_PATH=/home/martin/codes/utrunk/libexec/gcc/x86_64-pc-linux-gnu/8.0.1/:/home/martin/codes/utrunk/libexec/gcc/x86_64-pc-linux-gnu/8.0.1/:/home/martin/codes/utrunk/libexec/gcc/x86_64-pc-linux-gnu/:/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/:/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/
LIBRARY_PATH=/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/:/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/../../../../lib64/:/lib/x86_64-linux-gnu/:/lib/../lib64/:/usr/lib/x86_64-linux-gnu/:/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-O3' '-v' '-S' '-mtune=generic' '-march=x86-64'
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #6 from Martin Reinecke ---
Created attachment 43958
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43958&action=edit
assembler output with -march=native
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #5 from Martin Reinecke ---
Created attachment 43957
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43957&action=edit
assembler output without -march=native
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #4 from Martin Reinecke ---
Created attachment 43956
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43956&action=edit
reduced test case
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #3 from Martin Liška ---
And please also attach the output you get when adding the --verbose option.
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2018-04-17
                 CC|                            |marxin at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from Martin Liška ---
Can't reproduce on a Haswell i7 CPU. Martin, can you please attach the assembly
(-S) for both the native and the non-native build?
[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

--- Comment #1 from Martin Reinecke ---
Just re-tested on an Intel Core i5-4570; on this CPU, there is no performance
degradation.