================
@@ -2550,6 +2554,11 @@ static Value *upgradeNVVMIntrinsicCall(StringRef Name, CallBase *CI,
Intrinsic::ID IID = (Name == "fabs.ftz.f") ? Intrinsic::nvvm_fabs_ftz
: Intrinsic::nvvm_fabs;
Rep = Builder.CreateUnaryIntrinsic(IID, CI->getArgOperand(0));
+ } else if (Name.consume_front("ex2.approx.")) {
+ // nvvm.ex2.approx.{f,ftz.f,d,f16x2}
+ Intrinsic::ID IID = Name.starts_with("ftz") ? Intrinsic::nvvm_ex2_approx_ftz
+ : Intrinsic::nvvm_ex2_approx;
----------------
AlexMaclean wrote:
I think we're doing this in the backend because, if an `llvm.exp2` intrinsic
gets there, this is the most similar PTX instruction. As far as I can tell, it's
not necessarily a strictly correct lowering. If we encounter one of the
`nvvm.ex2.approx` intrinsics, though, I think we're obligated to ensure that the
value produced is exactly the same as if it were executed on hardware. If we
converted to the generic intrinsic, more precise constant folding might occur,
which would be incorrect.
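
To make the constant-folding concern concrete, here is a minimal standalone sketch. It assumes a made-up approximation: `approxExp2` is a hypothetical stand-in and is not the algorithm any real GPU uses for `ex2.approx`, and `foldedExp2` simply models a folder that evaluates the generic `exp2` with the host math library. All names here are illustrative and none come from the patch.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an approximate hardware exp2. This is NOT the
// real ex2.approx algorithm; it only models "a result that is close to,
// but not bit-identical with, the correctly rounded value".
static float approxExp2(float X) {
  float I = std::floor(X); // integer part of the exponent
  float F = X - I;         // fractional part in [0, 1)
  // Short polynomial for 2^F; exact at F == 0, inexact in between.
  float P = 1.0f + F * (0.6565f + F * 0.3435f);
  return std::ldexp(P, static_cast<int>(I));
}

// Models what a constant folder would compute if the call had been
// rewritten to the generic exp2 and then evaluated at compile time.
static float foldedExp2(float X) { return std::exp2(X); }

// Bit pattern of a float, so the comparison is exact rather than approximate.
static uint32_t bitsOf(float V) {
  uint32_t B;
  std::memcpy(&B, &V, sizeof(B));
  return B;
}

// True when folding through the generic function would change the value
// the program observes compared with the approximate operation.
static bool foldingChangesResult(float X) {
  return bitsOf(approxExp2(X)) != bitsOf(foldedExp2(X));
}
```

Under these assumptions, `foldingChangesResult(0.5f)` is true: the folded constant differs from what the approximate operation would have produced at run time, which is the kind of mismatch the upgrade should avoid introducing.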
https://github.com/llvm/llvm-project/pull/165446
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits