[clang] [llvm] [NVPTX] Add ex2.approx bf16 support and cleanup intrinsic definition (PR #165446)

Alex MacLean via cfe-commits Tue, 28 Oct 2025 12:15:35 -0700

================
@@ -2550,6 +2554,11 @@ static Value *upgradeNVVMIntrinsicCall(StringRef Name, CallBase *CI,
     Intrinsic::ID IID = (Name == "fabs.ftz.f") ? Intrinsic::nvvm_fabs_ftz
                                                : Intrinsic::nvvm_fabs;
     Rep = Builder.CreateUnaryIntrinsic(IID, CI->getArgOperand(0));
+  } else if (Name.consume_front("ex2.approx.")) {
+    // nvvm.ex2.approx.{f,ftz.f,d,f16x2}
+    Intrinsic::ID IID = Name.starts_with("ftz") ? Intrinsic::nvvm_ex2_approx_ftz
+                                                : Intrinsic::nvvm_ex2_approx;
----------------
AlexMaclean wrote:


I think we're doing this in the backend because if an `llvm.exp2` intrinsic
gets there, `ex2.approx` is the most similar PTX instruction. As far as I can
tell, it's not necessarily a strictly correct lowering. When we encounter one
of the `nvvm.ex2.approx` intrinsics, though, I think we're obligated to ensure
that the value produced is exactly the same as if it were executed on
hardware. If we converted it to the generic `llvm.exp2` intrinsic, more
precise constant folding might occur, which would be incorrect.
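
To make the concern concrete, here is a minimal sketch of how folding through
a more precise function can change the result. The names `approxExp2Model`,
`foldGenericExp2`, and `foldingChangesResult` are hypothetical, and the
bit-truncation model is only a stand-in for "less precise than libm"; it is
not the algorithm the hardware uses for `ex2.approx`.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical model of an approximate exp2: take the precise result and
// clear the 8 low mantissa bits. This only imitates reduced precision; it
// does NOT reproduce the real ex2.approx behavior.
float approxExp2Model(float x) {
  float precise = std::exp2(x);
  std::uint32_t bits;
  std::memcpy(&bits, &precise, sizeof(bits));
  bits &= ~std::uint32_t{0xFF};
  float result;
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}

// What a constant folder for the generic exp2 operation would be free to
// compute: the most precise answer available on the host.
float foldGenericExp2(float x) { return std::exp2(x); }

// True when folding through the generic operation would yield a value that
// differs from the approximate model for this input.
bool foldingChangesResult(float x) {
  return approxExp2Model(x) != foldGenericExp2(x);
}
```

Under this model, `foldingChangesResult(0.5f)` is true: the folded constant
and the approximate result differ in their low mantissa bits, which is the
kind of mismatch that keeping the `nvvm` intrinsic distinct avoids.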

https://github.com/llvm/llvm-project/pull/165446
