http://llvm.org/bugs/show_bug.cgi?id=20900

            Bug ID: 20900
           Summary: optimize reciprocal square root with fast-math (x86)
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: X86
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]
    Classification: Unclassified

$ ./clang -v
clang version 3.6.0 (217530)
Target: x86_64-apple-darwin13.3.0
Thread model: posix

$ cat rsqrt.c 
#include <math.h>
float reciprocal_square_root(float x) {
    return 1.0f / sqrtf(x);
}

$ ./clang -O2 -ffast-math -S -o - rsqrt.c 
...
    sqrtss    %xmm0, %xmm1
    movss    LCPI0_0(%rip), %xmm0
    divss    %xmm1, %xmm0


---------------------------------------------------------------------

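With -ffast-math, this should be lowered to an 'rsqrtss' estimate plus one
Newton-Raphson refinement step, instead of a full-precision 'sqrtss' followed
by 'divss'.

ICC 14 does this at -O2 (the two constants below are 3.0f and -0.5f):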

        rsqrtss   %xmm0, %xmm2
        mulss     %xmm2, %xmm0
        mulss     %xmm2, %xmm0
        movss     L_2il0floatpacket.2(%rip), %xmm1
        mulss     %xmm1, %xmm2
        subss     L_2il0floatpacket.1(%rip), %xmm0
        mulss     %xmm2, %xmm0
        ret       
L_2il0floatpacket.1:
    .long    0x40400000
L_2il0floatpacket.2:
    .long    0xbf000000
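
For reference, the ICC sequence above appears to compute
(-0.5 * e) * (x * e * e - 3.0), where e is the 'rsqrtss' estimate. That is
one Newton-Raphson step for 1/sqrt(x). A minimal C sketch of the same
arithmetic follows. The names rsqrt_estimate and rsqrt_refined are
hypothetical, the non-SSE fallback is only there so the sketch builds
elsewhere, and this is not meant as the lowering LLVM should emit verbatim.

```c
#include <assert.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* _mm_rsqrt_ss, _mm_set_ss, _mm_cvtss_f32 */
#else
#include <math.h>        /* fallback only, for targets without SSE */
#endif

/* Hypothetical helper: approximate 1/sqrt(x).
 * On SSE targets this maps to rsqrtss, whose documented relative
 * error is at most 1.5 * 2^-12. */
static float rsqrt_estimate(float x) {
#if defined(__SSE__)
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
#else
    return 1.0f / sqrtf(x);
#endif
}

/* Hypothetical helper: estimate plus one Newton-Raphson step,
 * written in the same form as the ICC output:
 *   (-0.5 * e) * (x * e * e - 3.0)  ==  0.5 * e * (3.0 - x * e * e)
 * Assumes x is positive and finite; x == 0 or x == inf would give
 * NaN here (inf * 0), which is only acceptable under fast-math rules.
 * The assert checks positivity only. */
float rsqrt_refined(float x) {
    assert(x > 0.0f);
    float e = rsqrt_estimate(x);
    return (-0.5f * e) * (x * e * e - 3.0f);
}
```

For example, rsqrt_refined(4.0f) should return a value close to 0.5f. The
result is roughly single precision but is not guaranteed to be correctly
rounded, which is why this transform only makes sense under -ffast-math.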

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
LLVMbugs mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/llvmbugs