https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67406

            Bug ID: 67406
           Summary: OMP SIMD cloning does not generate fma instruction for
                    AVX2 target
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vincenzo.innocente at cern dot ch
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Given the following source file, simdCloning.cc:
#pragma omp declare simd notinbranch
float fma(float x,float y, float z) {
   return x+y*z;
}

compiling with
c++ -S -fopenmp -Ofast -Wall simdCloning.cc; cat simdCloning.s
generates the same code for the AVX and AVX2 clones:
__ZGVdN8vvv__Z3fmafff:
LFB3:
        leaq    8(%rsp), %r10
LCFI5:
        andq    $-32, %rsp
        vmulps  %ymm2, %ymm1, %ymm1
        pushq   -8(%r10)
        pushq   %rbp
        vaddps  %ymm0, %ymm1, %ymm0

whereas I would have expected the AVX2 clone to use an FMA instruction:
__ZGVdN8vvv__Z3fmafff:
LFB3:
        leaq    8(%rsp), %r10
LCFI5:
        andq    $-32, %rsp
        vfmadd231ps     %ymm2, %ymm1, %ymm0
        pushq   -8(%r10)
        pushq   %rbp


This last listing was obtained by compiling with -mfma.
Unfortunately, in that case ALL clones use AVX2 instructions
(so again the AVX and AVX2 clones are identical).
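For anyone trying to reproduce this, a caller along the following lines may help. It is only a sketch: the names muladd, muladd_all and muladd_self_check are hypothetical (the function is renamed so it cannot be confused with the fma from <cmath>), and whether the loop actually calls a vector clone depends on the compiler version and flags.

```cpp
// Sketch only: all names here are illustrative, not taken from the report.
#include <cassert>
#include <cstddef>
#include <vector>

// Same body as the function in the report, under a different name.
#pragma omp declare simd notinbranch
float muladd(float x, float y, float z) {
    return x + y * z;
}

// Calls muladd from a simd loop so that the compiler is allowed to
// substitute one of the generated vector clones for the scalar call.
void muladd_all(const float* x, const float* y, const float* z,
                float* out, std::size_t n) {
#pragma omp simd
    for (std::size_t i = 0; i < n; ++i)
        out[i] = muladd(x[i], y[i], z[i]);
}

// Uses values that are exactly representable, so the result is the same
// whether the multiply-add is fused or done as separate mul and add.
void muladd_self_check() {
    const std::size_t n = 37;  // deliberately not a multiple of 8
    std::vector<float> x(n), y(n, 2.0f), z(n, 0.5f), out(n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        x[i] = static_cast<float>(i);
    muladd_all(x.data(), y.data(), z.data(), out.data(), n);
    for (std::size_t i = 0; i < n; ++i)
        assert(out[i] == static_cast<float>(i) + 1.0f);
}
```

Building this with the command line from the report, with and without -mfma, and comparing the clone bodies in the assembly should show the behaviour described above. Without -fopenmp the pragmas are ignored and the code stays scalar.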


BTW: is there any reason why no AVX512 clone is generated?
I am using gcc version 6.0.0 20150801 (experimental) [trunk revision 226463]
(GCC)
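To tell the clones apart in the assembly, it can help to decode the symbol names. The sketch below assumes the usual x86 vector function ABI naming, _ZGV<isa><mask><vlen><params>_<scalar name>, with ISA letters b, c, d and e for SSE2, AVX, AVX2 and AVX-512, plus the extra leading underscore that appears in the listings above. The helper decode_simd_clone and the SimdCloneInfo struct are hypothetical, and the parameter string is returned without further parsing.

```cpp
// Sketch only: a simplified decoder for vector-clone symbol names.
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

struct SimdCloneInfo {
    bool valid = false;
    std::string isa;          // "SSE2", "AVX", "AVX2" or "AVX-512"
    bool masked = false;      // 'M' = inbranch (masked), 'N' = notinbranch
    int vlen = 0;             // number of lanes
    std::string params;       // raw parameter letters, e.g. "vvv"
    std::string scalar_name;  // mangled name of the scalar function
};

SimdCloneInfo decode_simd_clone(std::string sym) {
    SimdCloneInfo info;
    // Drop the additional leading underscore seen in the report's listings.
    if (sym.rfind("__ZGV", 0) == 0) sym.erase(0, 1);
    if (sym.rfind("_ZGV", 0) != 0 || sym.size() < 7) return info;

    std::size_t pos = 4;
    switch (sym[pos++]) {
        case 'b': info.isa = "SSE2"; break;
        case 'c': info.isa = "AVX"; break;
        case 'd': info.isa = "AVX2"; break;
        case 'e': info.isa = "AVX-512"; break;
        default: return info;
    }

    const char mask = sym[pos++];
    if (mask == 'M') info.masked = true;
    else if (mask != 'N') return info;

    // Vector length: one or more decimal digits.
    const std::size_t digits_start = pos;
    while (pos < sym.size() &&
           std::isdigit(static_cast<unsigned char>(sym[pos]))) {
        info.vlen = info.vlen * 10 + (sym[pos] - '0');
        ++pos;
    }
    if (pos == digits_start) return info;

    // Parameter letters run up to the '_' that precedes the scalar name.
    const std::size_t sep = sym.find('_', pos);
    if (sep == std::string::npos) return info;
    info.params = sym.substr(pos, sep - pos);
    info.scalar_name = sym.substr(sep + 1);
    info.valid = true;
    return info;
}

// Usage check on the symbol quoted in the report.
void check_report_symbol() {
    const SimdCloneInfo info = decode_simd_clone("__ZGVdN8vvv__Z3fmafff");
    assert(info.valid && info.isa == "AVX2" && info.vlen == 8);
}
```

Under this reading, the symbol in the listings is the unmasked 8-lane AVX2 clone with three vector arguments, and an AVX-512 clone would carry the letter e in the same position.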
