https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859
--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Thorsten Kurth from comment #13)
> the compiler options are just -fopenmp. I am sure it does not have to do
> anything with vectorization as I compare the code runtime with and without
> the target directives and thus vectorization should be the same between
> them. The remaining OpenMP sections are the same. In our work we have not
> seen 10x because of insufficient vectorization, it is usually because of
> cache locality but that is the same for OMP 4.5 and OMP 3 because the loops
> are not touched.
> I do not specify an ISA choice, but I will try specifying KNL now and will
> tell you what the compiler is going to do.

The compiler doesn't optimize by default (i.e. the default is -O0), so if you
are measuring -O0 -fopenmp performance or code size, that is something that is
completely uninteresting. For -O0 the most important thing is compilation
speed, not quality of generated code. For runtime performance of generated
code, only -O2, -O3 or -Ofast are optimization levels that make sense.
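
To make the point about optimization levels concrete, here is a minimal sketch of a timing harness that reports whether it was built with optimization enabled. It is not code from this bug report: the kernel `axpy_sum`, the helpers `now_seconds` and `time_axpy`, and the problem size are hypothetical names and values chosen for illustration. The only compiler-specific assumption is that GCC predefines `__OPTIMIZE__` when any optimization level above -O0 is in effect.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime under strict -std=c99/c11 */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* GCC predefines __OPTIMIZE__ at -O1 and above; it is not defined at -O0. */
int built_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical kernel: y[i] += a * x[i], returning the sum of the updated y. */
double axpy_sum(double a, const double *x, double *y, long n)
{
    double sum = 0.0;
    /* The pragma is ignored when the file is compiled without -fopenmp. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++) {
        y[i] += a * x[i];
        sum += y[i];
    }
    return sum;
}

/* Wall-clock time in seconds from a monotonic clock. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Runs the kernel `repeats` times and returns the best wall-clock time,
 * or -1.0 if allocation fails. Prints the build type next to the timing
 * so that -O0 numbers are not mistaken for optimized ones. */
double time_axpy(long n, int repeats)
{
    assert(n > 0 && repeats > 0);
    double *x = malloc((size_t)n * sizeof *x);
    double *y = malloc((size_t)n * sizeof *y);
    if (!x || !y) {
        free(x);
        free(y);
        return -1.0;
    }
    for (long i = 0; i < n; i++) {
        x[i] = 1.0;
        y[i] = 0.0;
    }

    double best = -1.0;
    double checksum = 0.0;   /* keeps the result live so the loop is not elided */
    for (int r = 0; r < repeats; r++) {
        double t0 = now_seconds();
        checksum += axpy_sum(0.5, x, y, n);
        double t1 = now_seconds();
        if (best < 0.0 || t1 - t0 < best)
            best = t1 - t0;
    }

    printf("optimized build: %s, best of %d runs: %.6f s (checksum %g)\n",
           built_with_optimization() ? "yes" : "no (-O0)",
           repeats, best, checksum);
    free(x);
    free(y);
    return best;
}
```

One way to use it would be to call `time_axpy(1L << 24, 5)` from `main` and build the file twice, for example with `gcc -fopenmp bench.c` and `gcc -O2 -fopenmp bench.c` (`bench.c` being a placeholder file name). Only the second timing says anything about the quality of the generated code.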