Re: [PATCH] Restore the sqrt -> llvm.sqrt mapping in fast-math mode

Hal Finkel Thu, 12 Sep 2013 14:35:37 -0700

----- Original Message -----
> On 09/12/2013 11:09 PM, Hal Finkel wrote:
> > With the patch this time;)
> >
> >   -Hal
> >
> > ----- Original Message -----
> >> >On 09/12/2013 11:03 PM, Hal Finkel wrote:
> >>> > >Hello,
> >>> > >
> >>> > >Please review the attached patch which restores the libm sqrt*
> >>> > >->
> >>> > >@llvm.sqrt* mapping, but only in fast-math mode (specifically,
> >>> > >when the UnsafeFPMath or NoNaNsFPMath CodeGen options are
> >>> > >enabled). The @llvm.sqrt* intrinsics have slightly different
> >>> > >semantics from the libm call, specifically, they are undefined
> >>> > >when given a non-zero negative number (the libm calls will
> >>> > >always
> >>> > >return NaN for any negative number).
> >>> > >
> >>> > >This mapping was removed in r100613, and replaced with a TODO,
> >>> > >but
> >>> > >at that time the fast-math flags were not yet implemented. Now
> >>> > >that we have these, restoring this mapping is important
> >>> > >because it
> >>> > >will enable autovectorization of sqrt calls in loops (at least
> >>> > >in
> >>> > >fast-math mode).
> >> >
> >> >Patch?
> >> >
> >> >Tobias
> >> >
> >> >
> > -- Hal Finkel Assistant Computational Scientist Leadership
> > Computing
> > Facility Argonne National Laboratory
> >
> >
> > sqrt-intrin-fm-mapping.patch
> >
> >
> > Index: test/CodeGen/libcalls.c
> > ===================================================================
> > --- test/CodeGen/libcalls.c (revision 190624)
> > +++ test/CodeGen/libcalls.c (working copy)
> > @@ -1,8 +1,10 @@
> >   // RUN: %clang_cc1 -fmath-errno -emit-llvm -o - %s -triple
> >   i386-unknown-unknown | FileCheck -check-prefix CHECK-YES %s
> >   // RUN: %clang_cc1 -emit-llvm -o - %s -triple
> >   i386-unknown-unknown | FileCheck -check-prefix CHECK-NO %s
> > +// RUN: %clang_cc1 -ffast-math -menable-unsafe-fp-math -emit-llvm
> > -o - %s -triple i386-unknown-unknown | FileCheck -check-prefix
> > CHECK-FAST %s
> 
> Do you need -ffast-math here?


No (revised patch attached).

> 
> >
> >   // CHECK-YES-LABEL: define void @test_sqrt
> >   // CHECK-NO-LABEL: define void @test_sqrt
> > +// CHECK-FAST-LABEL: define void @test_sqrt
> >   void test_sqrt(float a0, double a1, long double a2) {
> >     // Following llvm-gcc's lead, we never emit these as
> >     intrinsics;
> >     // no-math-errno isn't good enough.  We could probably use
> >     intrinsics
> > @@ -27,6 +29,9 @@
> >   // CHECK-NO: declare float @sqrtf(float) [[NUW_RN:#[0-9]+]]
> >   // CHECK-NO: declare double @sqrt(double) [[NUW_RN]]
> >   // CHECK-NO: declare x86_fp80 @sqrtl(x86_fp80) [[NUW_RN]]
> > +// CHECK-FAST: declare float @llvm.sqrt.f32(float)
> > +// CHECK-FAST: declare double @llvm.sqrt.f64(double)
> > +// CHECK-FAST: declare x86_fp80 @llvm.sqrt.f80(x86_fp80)
> >
> >   // CHECK-YES-LABEL: define void @test_pow
> >   // CHECK-NO-LABEL: define void @test_pow
> > Index: lib/CodeGen/CGBuiltin.cpp
> > ===================================================================
> > --- lib/CodeGen/CGBuiltin.cpp       (revision 190624)
> > +++ lib/CodeGen/CGBuiltin.cpp       (working copy)
> > @@ -1282,12 +1282,19 @@
> >     case Builtin::BIsqrt:
> >     case Builtin::BIsqrtf:
> >     case Builtin::BIsqrtl: {
> > -    // TODO: there is currently no set of optimizer flags
> > -    // sufficient for us to rewrite sqrt to @llvm.sqrt.
> > -    // -fmath-errno=0 is not good enough; we need finiteness.
> > -    // We could probably precondition the call with an ult
> > -    // against 0, but is that worth the complexity?
> > -    break;
> > +    // Transform a call to sqrt* into a @llvm.sqrt.* intrinsic
> > call, but only
> > +    // in finite- or unsafe-math mode (the intrinsic has different
> > semantics
> > +    // for handling negative numbers compared to the library
> > function, so
> > +    // -fmath-errno=0 is not enough).
> 
> I did not read the surrounding code, but I assume -fmath-errno=0 is
> already checked outside.

I believe that this is taken care of by the "if (!FD->hasAttr<ConstAttr>())" 
check... because of the LIBBUILTIN definition, this will get that attribute 
when -fmath-errno=0.

> 
> > +    if (!FD->hasAttr<ConstAttr>())
> > +      break;
> > +    if (!(CGM.getCodeGenOpts().UnsafeFPMath ||
> > +          CGM.getCodeGenOpts().NoNaNsFPMath))
> 
> Otherwise this looks good from my side, but needs a clang person to
> accept it.

Yep, thanks again!

 -Hal

> 
> Tobias
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Index: test/CodeGen/libcalls.c
===================================================================
--- test/CodeGen/libcalls.c	(revision 190624)
+++ test/CodeGen/libcalls.c	(working copy)
@@ -1,8 +1,10 @@
 // RUN: %clang_cc1 -fmath-errno -emit-llvm -o - %s -triple i386-unknown-unknown | FileCheck -check-prefix CHECK-YES %s
 // RUN: %clang_cc1 -emit-llvm -o - %s -triple i386-unknown-unknown | FileCheck -check-prefix CHECK-NO %s
+// RUN: %clang_cc1 -menable-unsafe-fp-math -emit-llvm -o - %s -triple i386-unknown-unknown | FileCheck -check-prefix CHECK-FAST %s
 
 // CHECK-YES-LABEL: define void @test_sqrt
 // CHECK-NO-LABEL: define void @test_sqrt
+// CHECK-FAST-LABEL: define void @test_sqrt
 void test_sqrt(float a0, double a1, long double a2) {
   // Following llvm-gcc's lead, we never emit these as intrinsics;
   // no-math-errno isn't good enough.  We could probably use intrinsics
@@ -27,6 +29,9 @@
 // CHECK-NO: declare float @sqrtf(float) [[NUW_RN:#[0-9]+]]
 // CHECK-NO: declare double @sqrt(double) [[NUW_RN]]
 // CHECK-NO: declare x86_fp80 @sqrtl(x86_fp80) [[NUW_RN]]
+// CHECK-FAST: declare float @llvm.sqrt.f32(float)
+// CHECK-FAST: declare double @llvm.sqrt.f64(double)
+// CHECK-FAST: declare x86_fp80 @llvm.sqrt.f80(x86_fp80)
 
 // CHECK-YES-LABEL: define void @test_pow
 // CHECK-NO-LABEL: define void @test_pow
Index: lib/CodeGen/CGBuiltin.cpp
===================================================================
--- lib/CodeGen/CGBuiltin.cpp	(revision 190624)
+++ lib/CodeGen/CGBuiltin.cpp	(working copy)
@@ -1282,12 +1282,19 @@
   case Builtin::BIsqrt:
   case Builtin::BIsqrtf:
   case Builtin::BIsqrtl: {
-    // TODO: there is currently no set of optimizer flags
-    // sufficient for us to rewrite sqrt to @llvm.sqrt.
-    // -fmath-errno=0 is not good enough; we need finiteness.
-    // We could probably precondition the call with an ult
-    // against 0, but is that worth the complexity?
-    break;
+    // Transform a call to sqrt* into a @llvm.sqrt.* intrinsic call, but only
+    // in finite- or unsafe-math mode (the intrinsic has different semantics
+    // for handling negative numbers compared to the library function, so
+    // -fmath-errno=0 is not enough).
+    if (!FD->hasAttr<ConstAttr>())
+      break;
+    if (!(CGM.getCodeGenOpts().UnsafeFPMath ||
+          CGM.getCodeGenOpts().NoNaNsFPMath))
+      break;
+    Value *Arg0 = EmitScalarExpr(E->getArg(0));
+    llvm::Type *ArgType = Arg0->getType();
+    Value *F = CGM.getIntrinsic(Intrinsic::sqrt, ArgType);
+    return RValue::get(Builder.CreateCall(F, Arg0));
   }
 
   case Builtin::BIpow:

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [PATCH] Restore the sqrt -> llvm.sqrt mapping in fast-math mode

Reply via email to