jhuber6 wrote:
Well, I suppose technically there's the weird FTZ handling. I don't know if
libclc wants to copy this: the `__nvvm_reflect` calls basically take compile
flags and statically lower them before reaching the backend, which is really hacky.
```llvm
define float @__nv_rsqrtf(float %x) #0 {
  %1 = call i32 @__nvvm_reflect(ptr @.str)
  %2 = icmp ne i32 %1, 0
  br i1 %2, label %3, label %5

3:                                                ; preds = %0
  %4 = call float @llvm.nvvm.rsqrt.approx.ftz.f(float %x)
  br label %7

5:                                                ; preds = %0
  %6 = call float @llvm.nvvm.rsqrt.approx.f(float %x)
  br label %7

7:                                                ; preds = %5, %3
  %.0 = phi float [ %4, %3 ], [ %6, %5 ]
  ret float %.0
}
```
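For illustration, here is roughly what I'd expect the function above to look like once the reflect call has been folded and the dead branch cleaned up. This is a hand-written sketch, not actual compiler output. It assumes `@.str` is the FTZ query string and that the build asked for flush-to-zero, so the call folds to a nonzero constant.

```llvm
; Sketch only (assumption): @__nvvm_reflect(ptr @.str) was folded to a nonzero
; constant because FTZ was requested, so only the %3 block from the original
; survives and the phi collapses to its single incoming value.
define float @__nv_rsqrtf(float %x) #0 {
  %1 = call float @llvm.nvvm.rsqrt.approx.ftz.f(float %x)
  ret float %1
}
```

Without FTZ the same folding would presumably leave only the `@llvm.nvvm.rsqrt.approx.f` call instead.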
https://github.com/llvm/llvm-project/pull/205709