On 09/12/2013 11:09 PM, Hal Finkel wrote:
With the patch this time;)
-Hal
----- Original Message -----
>On 09/12/2013 11:03 PM, Hal Finkel wrote:
> >Hello,
> >
> >Please review the attached patch which restores the libm sqrt* ->
> >@llvm.sqrt* mapping, but only in fast-math mode (specifically,
> >when the UnsafeFPMath or NoNaNsFPMath CodeGen options are
> >enabled). The @llvm.sqrt* intrinsics have slightly different
> >semantics from the libm call, specifically, they are undefined
> >when given a non-zero negative number (the libm calls will always
> >return NaN for any negative number).
> >
> >This mapping was removed in r100613, and replaced with a TODO, but
> >at that time the fast-math flags were not yet implemented. Now
> >that we have these, restoring this mapping is important because it
> >will enable autovectorization of sqrt calls in loops (at least in
> >fast-math mode).
>
>Patch?
>
>Tobias
>
>
-- Hal Finkel Assistant Computational Scientist Leadership Computing
Facility Argonne National Laboratory
sqrt-intrin-fm-mapping.patch
Index: test/CodeGen/libcalls.c
===================================================================
--- test/CodeGen/libcalls.c (revision 190624)
+++ test/CodeGen/libcalls.c (working copy)
@@ -1,8 +1,10 @@
// RUN: %clang_cc1 -fmath-errno -emit-llvm -o - %s -triple
i386-unknown-unknown | FileCheck -check-prefix CHECK-YES %s
// RUN: %clang_cc1 -emit-llvm -o - %s -triple i386-unknown-unknown |
FileCheck -check-prefix CHECK-NO %s
+// RUN: %clang_cc1 -ffast-math -menable-unsafe-fp-math -emit-llvm -o - %s
-triple i386-unknown-unknown | FileCheck -check-prefix CHECK-FAST %s
Do you need -ffast-math here?
// CHECK-YES-LABEL: define void @test_sqrt
// CHECK-NO-LABEL: define void @test_sqrt
+// CHECK-FAST-LABEL: define void @test_sqrt
void test_sqrt(float a0, double a1, long double a2) {
// Following llvm-gcc's lead, we never emit these as intrinsics;
// no-math-errno isn't good enough. We could probably use intrinsics
@@ -27,6 +29,9 @@
// CHECK-NO: declare float @sqrtf(float) [[NUW_RN:#[0-9]+]]
// CHECK-NO: declare double @sqrt(double) [[NUW_RN]]
// CHECK-NO: declare x86_fp80 @sqrtl(x86_fp80) [[NUW_RN]]
+// CHECK-FAST: declare float @llvm.sqrt.f32(float)
+// CHECK-FAST: declare double @llvm.sqrt.f64(double)
+// CHECK-FAST: declare x86_fp80 @llvm.sqrt.f80(x86_fp80)
// CHECK-YES-LABEL: define void @test_pow
// CHECK-NO-LABEL: define void @test_pow
Index: lib/CodeGen/CGBuiltin.cpp
===================================================================
--- lib/CodeGen/CGBuiltin.cpp (revision 190624)
+++ lib/CodeGen/CGBuiltin.cpp (working copy)
@@ -1282,12 +1282,19 @@
case Builtin::BIsqrt:
case Builtin::BIsqrtf:
case Builtin::BIsqrtl: {
- // TODO: there is currently no set of optimizer flags
- // sufficient for us to rewrite sqrt to @llvm.sqrt.
- // -fmath-errno=0 is not good enough; we need finiteness.
- // We could probably precondition the call with an ult
- // against 0, but is that worth the complexity?
- break;
+ // Transform a call to sqrt* into a @llvm.sqrt.* intrinsic call, but only
+ // in finite- or unsafe-math mode (the intrinsic has different semantics
+ // for handling negative numbers compared to the library function, so
+ // -fmath-errno=0 is not enough).
I did not read the surrounding code, but I assume -fmath-errno=0 is
already checked outside.
+ if (!FD->hasAttr<ConstAttr>())
+ break;
+ if (!(CGM.getCodeGenOpts().UnsafeFPMath ||
+ CGM.getCodeGenOpts().NoNaNsFPMath))
Otherwise this looks good from my side, but needs a clang person to
accept it.
Tobias
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits