http://llvm.org/bugs/show_bug.cgi?id=22428

            Bug ID: 22428
           Summary: Floating-point "and" not optimized on x86-64
           Product: new-bugs
           Version: 3.5
          Hardware: Macintosh
                OS: MacOS X
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]
    Classification: Unclassified

I notice that clang does not generate "vandpd" for a bitwise "and" applied to
a floating-point value when the operation is written with memcpy and an
integer mask. The following example demonstrates this:
{{{
#include <math.h>
#include <string.h>
double fand1(double x)
{
  unsigned long ix;
  memcpy(&ix, &x, 8);
  ix &= 0x7fffffffffffffffUL;
  memcpy(&x, &ix, 8);
  return x;
}
double fand2(double x)
{
  return fabs(x);
}
}}}

When I compile this via:
{{{
clang-mp-3.5 -O3 -march=native -S fand.c -o fand-clang-3.5.s
}}}
(OS X, x86-64 CPU, Intel Core i7), this results in:
{{{
_fand1:                                 ## @fand1
    pushq    %rbp
    movq    %rsp, %rbp
    vmovq    %xmm0, %rax
    movabsq    $9223372036854775807, %rcx ## imm = 0x7FFFFFFFFFFFFFFF
    andq    %rax, %rcx
    vmovq    %rcx, %xmm0
    popq    %rbp
    retq

_fand2:                                 ## @fand2
    pushq    %rbp
    movq    %rsp, %rbp
    vandpd    LCPI1_0(%rip), %xmm0, %xmm0
    popq    %rbp
    retq
}}}

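This shows that (a) for fand1, clang moves the value to an integer register,
performs the bitwise "and" there, and moves the result back, which is probably
slower, while (b) for fand2, the implementors of "fabs" assume that a single
"vandpd" instruction is faster. Since both functions compute the same result,
clang could generate "vandpd" for fand1 as well.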
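
As a rough sketch of the point above, the following compares the integer-mask
version against fabs bit for bit, and adds a variant that asks for the "and" in
the SSE domain through intrinsics. The names bits_of, fand3 and
check_fand_equivalence are hypothetical helpers, not part of the original test
case, and the intrinsics path assumes an SSE2-capable x86 target (it falls back
to fabs elsewhere). Whether a given compiler version emits "vandpd" for fand3
still needs to be confirmed by inspecting the generated assembly.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: return the raw IEEE-754 bit pattern of a double. */
static uint64_t bits_of(double x)
{
  uint64_t u;
  memcpy(&u, &x, sizeof u);
  return u;
}

/* Same idea as fand1 in the report: clear the sign bit with an integer mask. */
static double fand1(double x)
{
  uint64_t ix;
  memcpy(&ix, &x, sizeof ix);
  ix &= UINT64_C(0x7fffffffffffffff);
  memcpy(&x, &ix, sizeof x);
  return x;
}

/* Hypothetical variant: request the "and" in the SSE domain explicitly. */
static double fand3(double x)
{
#if defined(__SSE2__)
  /* All bits set except the sign bit, in both 64-bit lanes. */
  const __m128d mask =
      _mm_castsi128_pd(_mm_set1_epi64x(0x7fffffffffffffffLL));
  return _mm_cvtsd_f64(_mm_and_pd(_mm_set_sd(x), mask));
#else
  return fabs(x); /* assumption: portable fallback on non-SSE2 targets */
#endif
}

/* Check that all three forms agree bit for bit on a few sample inputs. */
static void check_fand_equivalence(void)
{
  const double samples[] = { 0.0, -0.0, 1.5, -1.5, INFINITY, -INFINITY };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
    const uint64_t expected = bits_of(fabs(samples[i]));
    assert(bits_of(fand1(samples[i])) == expected);
    assert(bits_of(fand3(samples[i])) == expected);
  }
}
```

Calling check_fand_equivalence() from a small driver should pass silently if
the three forms agree, which is what would make rewriting fand1 into the fand2
code sequence a valid transformation for these inputs.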

