http://llvm.org/bugs/show_bug.cgi?id=22428
Bug ID: 22428
Summary: Floating-point "and" not optimized on x86-64
Product: new-bugs
Version: 3.5
Hardware: Macintosh
OS: MacOS X
Status: NEW
Severity: enhancement
Priority: P
Component: new bugs
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected]
Classification: Unclassified
I notice that clang does not generate "vandpd" for floating-point "and"
operations. Here is example code that demonstrates this:
{{{
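#include <math.h>
#include <string.h>

/* Clear the sign bit by hand: copy the double's bits into an integer,
   mask, and copy them back. */
double fand1(double x)
{
    unsigned long ix;
    memcpy(&ix, &x, 8);
    ix &= 0x7fffffffffffffffUL;
    memcpy(&x, &ix, 8);
    return x;
}

/* Same result via the library function. */
double fand2(double x)
{
    return fabs(x);
}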
}}}
When I compile this via:
{{{
clang-mp-3.5 -O3 -march=native -S fand.c -o fand-clang-3.5.s
}}}
(OS X, x86-64 CPU, Intel Core i7), this results in:
{{{
_fand1:                                 ## @fand1
        pushq   %rbp
        movq    %rsp, %rbp
        vmovq   %xmm0, %rax
        movabsq $9223372036854775807, %rcx ## imm = 0x7FFFFFFFFFFFFFFF
        andq    %rax, %rcx
        vmovq   %rcx, %xmm0
        popq    %rbp
        retq
_fand2:                                 ## @fand2
        pushq   %rbp
        movq    %rsp, %rbp
        vandpd  LCPI1_0(%rip), %xmm0, %xmm0
        popq    %rbp
        retq
}}}
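This shows that (a) clang performs the bitwise "and" in an integer register,
which needs a "vmovq" in each direction between %xmm0 and the general-purpose
registers plus a "movabsq" to load the mask, and is probably slower, while
(b) the implementors of "fabs" assume that using the "vandpd" instruction is
faster.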
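
For comparison, here is a minimal sketch of the same sign-bit mask written so
that the value never leaves the vector registers. It assumes the SSE2
intrinsics from <emmintrin.h> and falls back to the memcpy form of "fand1"
elsewhere. The names "fand_sse2" and "bits_of" are hypothetical and not part of
the test case above. I have not checked which instructions clang emits for it.

```c
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: clear the sign bit of x, using a vector-register
   "and" when SSE2 intrinsics are available. */
double fand_sse2(double x)
{
#if defined(__SSE2__)
    /* All bits set except the sign bit, reinterpreted as packed doubles. */
    const __m128d mask =
        _mm_castsi128_pd(_mm_set1_epi64x(0x7fffffffffffffffLL));
    /* Put x in the low lane, mask it, and extract the low lane again. */
    return _mm_cvtsd_f64(_mm_and_pd(_mm_set_sd(x), mask));
#else
    /* Portable fallback: same integer masking as fand1. */
    uint64_t ix;
    memcpy(&ix, &x, sizeof ix);
    ix &= UINT64_C(0x7fffffffffffffff);
    memcpy(&x, &ix, sizeof x);
    return x;
#endif
}

/* Hypothetical helper: return the raw bit pattern of a double, so results
   can be compared exactly (including the sign of zero). */
uint64_t bits_of(double x)
{
    uint64_t b;
    memcpy(&b, &x, sizeof b);
    return b;
}
```

Comparing bits_of(fand_sse2(x)) with bits_of(fabs(x)) for a few inputs,
including -0.0, is a quick way to confirm that both forms clear only the sign
bit.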