http://llvm.org/bugs/show_bug.cgi?id=20043
Bug ID: 20043
Summary: Only one version of FMA3 instruction is being generated
Product: clang
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: -New Bugs
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected]
Classification: Unclassified
Given the following code:
#include <immintrin.h>

__m128 fmatest(__m128 x)
{
    return _mm_fmadd_ps(x, _mm_set1_ps(2.0f), _mm_set1_ps(-1.0f));
}
I get the following output from Clang 3.4 (using -O3 -march=core-avx2):
.LCPI0_0:
        .long 3212836864 # float -1
.LCPI0_1:
        .long 1073741824 # float 2
fmatest(float __vector(4)): # @fmatest(float __vector(4))
        vbroadcastss xmm2, dword ptr [rip + .LCPI0_0]
        vbroadcastss xmm1, dword ptr [rip + .LCPI0_1]
        vfmadd213ps xmm1, xmm0, xmm2
        vmovaps xmm0, xmm1
        ret
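The vmovaps would be unnecessary if a different FMA3 form (or operand order)
were selected, so that the result is written directly to xmm0, the register
that already holds x and carries the return value.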
For instance this is what GCC 4.9 produces:
fmatest(float __vector):
        vmovaps xmm1, XMMWORD PTR .LC1[rip]
        vfmadd132ps xmm0, xmm1, XMMWORD PTR .LC0[rip]
        ret
.LC0:
        .long 1073741824
        .long 1073741824
        .long 1073741824
        .long 1073741824
.LC1:
        .long 3212836864
        .long 3212836864
        .long 3212836864
        .long 3212836864
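
To make the difference between the two listings concrete, here is a rough
scalar sketch of the three FMA3 operand orders (132, 213, 231) as documented
for the x86 instruction set. The function names and the forms_agree helper are
made up for illustration and are not part of Clang, GCC, or immintrin.h, and
the sketch does not model the single rounding of a real fused multiply-add. It
only shows which register ends up holding the result in each case.

```c
#include <assert.h>

/* Scalar models of "op dst, src2, src3" (Intel syntax). Each function
   returns the value written back to dst. Names are illustrative only. */
float fmadd132(float dst, float src2, float src3) { return dst * src3 + src2; }
float fmadd213(float dst, float src2, float src3) { return src2 * dst + src3; }
float fmadd231(float dst, float src2, float src3) { return src2 * src3 + dst; }

/* Hypothetical helper: computes x * 2 - 1 the way each listing does and
   returns 1 when every variant matches the plain expression. */
int forms_agree(float x)
{
    const float two = 2.0f, minus_one = -1.0f;
    const float expected = x * two + minus_one;

    /* Clang 3.4: vfmadd213ps xmm1, xmm0, xmm2
       dst is the register holding 2.0, so the result lands in xmm1
       and has to be copied to xmm0 afterwards. */
    float clang_xmm1 = fmadd213(two, x, minus_one);

    /* GCC 4.9: vfmadd132ps xmm0, xmm1, [.LC0]
       dst is the register holding x, so the result lands in xmm0. */
    float gcc_xmm0 = fmadd132(x, minus_one, two);

    /* The 213 form with the two multiplicands swapped also writes xmm0. */
    float alt_xmm0 = fmadd213(x, two, minus_one);

    /* The 231 form overwrites the addend register instead. */
    float addend_reg = fmadd231(minus_one, x, two);

    return clang_xmm1 == expected && gcc_xmm0 == expected &&
           alt_xmm0 == expected && addend_reg == expected;
}
```

Calling forms_agree with an exactly representable input such as 3.0f should
return 1. All variants compute the same value, so the only difference is the
destination register, which is why choosing a form whose destination is xmm0
would avoid the extra vmovaps.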