[Bug tree-optimization/85416] Massive performance regression when switching on "-march=native"

2018-04-17 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85416

Alexander Monakov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |INVALID

--- Comment #14 from Alexander Monakov  ---
Ah, the linked report actually says very clearly that fixes landed in Glibc
2.25, so I'll close this bug: there is nothing to do on the GCC side.

2018-04-17 Thread amonakov at gcc dot gnu.org

--- Comment #13 from Alexander Monakov  ---
This is most likely a variant of 

  https://bugzilla.redhat.com/show_bug.cgi?id=1421121

so hitting this bug requires a specific CPU model.

It looks as if SSE-AVX transition penalties appear when execution switches
between the pure-SSE sinf code in libm and the VEX-prefixed SSE code in the
main program, after the ld.so runtime resolver has affected AVX state tracking
in the CPU.

I'm not sure if any patches have landed on the Glibc side to avoid this, but in
any case this should be re-reported against Glibc if needed; GCC cannot improve
the situation.

An easy workaround would be to pass -Wl,-z,now when linking. This disables lazy
binding, so the runtime resolver is not invoked on the first call to sinf.

2018-04-17 Thread mar...@mpa-garching.mpg.de

--- Comment #12 from Martin Reinecke  ---
Created attachment 43961
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43961&action=edit
perf annotate output with -march=native

2018-04-17 Thread mar...@mpa-garching.mpg.de

--- Comment #11 from Martin Reinecke  ---
Created attachment 43960
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43960&action=edit
perf annotate output without -march=native

2018-04-17 Thread marxin at gcc dot gnu.org

--- Comment #10 from Martin Liška  ---
And please rebuild the binaries with -g and attach perf annotate output.
Thanks.

2018-04-17 Thread mar...@mpa-garching.mpg.de

--- Comment #9 from Martin Reinecke  ---
Sure!

martin@martin-Latitude-E7450 ~/tmp $ gcc -O3 testcase2.c -lm
martin@martin-Latitude-E7450 ~/tmp $ perf stat ./a.out

 Performance counter stats for './a.out':

       1109,985866      task-clock (msec)         #    0,999 CPUs utilized
                 2      context-switches          #    0,002 K/sec
                 0      cpu-migrations            #    0,000 K/sec
             2.000      page-faults               #    0,002 M/sec
     3.155.388.780      cycles                    #    2,843 GHz
     6.717.336.961      instructions              #    2,13  insn per cycle
       979.526.022      branches                  #  882,467 M/sec
            38.112      branch-misses             #    0,00% of all branches

       1,110639187 seconds time elapsed

martin@martin-Latitude-E7450 ~/tmp $ gcc -O3 testcase2.c -march=native -lm
martin@martin-Latitude-E7450 ~/tmp $ perf stat ./a.out

 Performance counter stats for './a.out':

       7724,004864      task-clock (msec)         #    1,000 CPUs utilized
                86      context-switches          #    0,011 K/sec
                 1      cpu-migrations            #    0,000 K/sec
             2.004      page-faults               #    0,259 K/sec
    22.129.645.853      cycles                    #    2,865 GHz
     6.723.657.441      instructions              #    0,30  insn per cycle
       980.761.202      branches                  #  126,976 M/sec
           171.813      branch-misses             #    0,02% of all branches

       7,726058359 seconds time elapsed

2018-04-17 Thread amonakov at gcc dot gnu.org

Alexander Monakov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #8 from Alexander Monakov  ---
Can you also run the tests under 'perf stat'?

2018-04-17 Thread mar...@mpa-garching.mpg.de

--- Comment #7 from Martin Reinecke  ---
Here is the output of --verbose:

martin@martin-Latitude-E7450 ~/tmp $ gcc -O3  testcase2.c --verbose -lm -S
Using built-in specs.
COLLECT_GCC=gcc
Target: x86_64-pc-linux-gnu
Configured with: /home/martin/codes/gccgit/configure --disable-bootstrap
--disable-multilib --prefix=/home/martin/codes/utrunk
--enable-languages=c++,fortran --enable-target=all --enable-checking=release
Thread model: posix
gcc version 8.0.1 20180411 (experimental) [trunk revision
b3ed066d3a5:895af22275d:7d24c3846c904b6e1ffea0bee0c58a9f7bcc23cb] (GCC) 
COLLECT_GCC_OPTIONS='-O3' '-v' '-S' '-mtune=generic' '-march=x86-64'
 /home/martin/codes/utrunk/libexec/gcc/x86_64-pc-linux-gnu/8.0.1/cc1 -quiet -v
-imultiarch x86_64-linux-gnu testcase2.c -quiet -dumpbase testcase2.c
-mtune=generic -march=x86-64 -auxbase testcase2 -O3 -version -o testcase2.s
GNU C17 (GCC) version 8.0.1 20180411 (experimental) [trunk revision
b3ed066d3a5:895af22275d:7d24c3846c904b6e1ffea0bee0c58a9f7bcc23cb]
(x86_64-pc-linux-gnu)
compiled by GNU C version 5.4.0 20160609, GMP version 6.1.0, MPFR
version 3.1.4, MPC version 1.0.3, isl version isl-0.16.1-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory
"/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/../../../../x86_64-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/include
 /usr/local/include
 /home/martin/codes/utrunk/include
 /home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C17 (GCC) version 8.0.1 20180411 (experimental) [trunk revision
b3ed066d3a5:895af22275d:7d24c3846c904b6e1ffea0bee0c58a9f7bcc23cb]
(x86_64-pc-linux-gnu)
compiled by GNU C version 5.4.0 20160609, GMP version 6.1.0, MPFR
version 3.1.4, MPC version 1.0.3, isl version isl-0.16.1-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 08c992208389b44fd477c64ce2b4d084
COMPILER_PATH=/home/martin/codes/utrunk/libexec/gcc/x86_64-pc-linux-gnu/8.0.1/:/home/martin/codes/utrunk/libexec/gcc/x86_64-pc-linux-gnu/8.0.1/:/home/martin/codes/utrunk/libexec/gcc/x86_64-pc-linux-gnu/:/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/:/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/
LIBRARY_PATH=/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/:/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/../../../../lib64/:/lib/x86_64-linux-gnu/:/lib/../lib64/:/usr/lib/x86_64-linux-gnu/:/home/martin/codes/utrunk/lib/gcc/x86_64-pc-linux-gnu/8.0.1/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-O3' '-v' '-S' '-mtune=generic' '-march=x86-64'

2018-04-17 Thread mar...@mpa-garching.mpg.de

--- Comment #6 from Martin Reinecke  ---
Created attachment 43958
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43958&action=edit
assembler output with -march=native

2018-04-17 Thread mar...@mpa-garching.mpg.de

--- Comment #5 from Martin Reinecke  ---
Created attachment 43957
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43957&action=edit
assembler output without -march=native

2018-04-17 Thread mar...@mpa-garching.mpg.de

--- Comment #4 from Martin Reinecke  ---
Created attachment 43956
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43956&action=edit
reduced test case

2018-04-17 Thread marxin at gcc dot gnu.org

--- Comment #3 from Martin Liška  ---
And please attach the output produced when the --verbose option is added.

2018-04-17 Thread marxin at gcc dot gnu.org

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2018-04-17
                 CC|                            |marxin at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from Martin Liška  ---
Can't reproduce on a Haswell i7 CPU.
Martin, can you please attach the assembly (-S) for both the native and the
non-native build?

2018-04-16 Thread mar...@mpa-garching.mpg.de

--- Comment #1 from Martin Reinecke  ---
Just re-tested on an Intel Core i5-4570; on this CPU, there is no performance
degradation.