jhuber6 wrote:

Well, I suppose technically there's the weird FTZ handling. I don't know if 
libclc wants to copy this. The `__nvvm_reflect` calls basically take compile 
flags and get statically lowered before reaching the backend, which is really 
hacky. For example:
```llvm
define float @__nv_rsqrtf(float %x) #0 {
  %1 = call i32 @__nvvm_reflect(ptr @.str)
  %2 = icmp ne i32 %1, 0
  br i1 %2, label %3, label %5

3:                                                ; preds = %0
  %4 = call float @llvm.nvvm.rsqrt.approx.ftz.f(float %x)
  br label %7

5:                                                ; preds = %0
  %6 = call float @llvm.nvvm.rsqrt.approx.f(float %x)
  br label %7

7:                                                ; preds = %5, %3
  %.0 = phi float [ %4, %3 ], [ %6, %5 ]
  ret float %.0
}
```
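
For reference, here is a rough sketch of what that function should look like once the reflect call has been folded. This assumes `@.str` holds `"__CUDA_FTZ"` (the snippet above doesn't show it) and that the module was built with FTZ enabled, which as far as I know shows up as the `nvvm-reflect-ftz` module flag. Treat the exact output as illustrative, not something I dumped from a real build.

```llvm
; Assumption: @.str is "__CUDA_FTZ" and FTZ is on, so the reflect call
; folds to 1, the branch becomes unconditional, and only the FTZ path survives.
define float @__nv_rsqrtf(float %x) {
  %1 = call float @llvm.nvvm.rsqrt.approx.ftz.f(float %x)
  ret float %1
}

declare float @llvm.nvvm.rsqrt.approx.ftz.f(float)

; Assumed module flag that the reflect lowering reads for "__CUDA_FTZ".
!llvm.module.flags = !{!0}
!0 = !{i32 4, !"nvvm-reflect-ftz", i32 1}
```

With the flag set to 0 the same folding would presumably keep only the `llvm.nvvm.rsqrt.approx.f` call instead.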

https://github.com/llvm/llvm-project/pull/205709