[libclc] [libclc] Refine __clc_fp*_subnormals_supported and __clc_flush_denormal_if_not_supported (PR #157633)

Mészáros Gergely via cfe-commits Sat, 18 Oct 2025 15:57:57 -0700

================
@@ -127,9 +127,9 @@ _CLC_DEF _CLC_OVERLOAD float __clc_sw_fma(float a, float b, float c) {
     return c;
   }
 
-  a = __clc_flush_denormal_if_not_supported(a);
-  b = __clc_flush_denormal_if_not_supported(b);
-  c = __clc_flush_denormal_if_not_supported(c);
+  a = __clc_soft_flush_denormal(a);
+  b = __clc_soft_flush_denormal(b);
+  c = __clc_soft_flush_denormal(c);
----------------
Maetveis wrote:


> surely compiler-rt already has an implementation?

It doesn't. LLVM libc has one, but it uses FP64, so I don't think it is of much
help. I'd expect most targets that don't have hardware FMA not to have FP64
either.

I think dropping the software FMA would impact:
- SPIR-V, which would then start generating the `GLSL.std.450` extended
instruction FMA. The problem is that this instruction is (AFAICT) allowed to
round the intermediate product, but the OpenCL spec doesn't allow that. I'm not
sure whether drivers actually implement it as fused.
Arguably the lowering is a bug in LLVM: `@llvm.fma` is specified to be fused
without fast math.
- R600: not all old R600 targets have FMA, so I think this change would break
them. These GPUs are more than 10 years old at this point, though.
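
To make the fused vs. rounded-intermediate distinction concrete, here is a
minimal C sketch. It is not libclc code: the helper names `mul_then_add` and
`fused_reference` are made up for illustration, and the reference goes through
`double`, which is only exact for inputs like the ones below (in general the
final conversion can double-round, and it is exactly the kind of FP64 dependency
mentioned above).

```c
/* Hypothetical helpers, for illustration only. */

/* Unfused: the product is rounded to float before the add.
 * The volatile store keeps the compiler from contracting this into an FMA
 * or holding the product in higher precision. */
static float mul_then_add(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

/* Reference for the fused result: a float*float product has at most 48
 * significant bits, so it is exact in double. The add and the final
 * conversion may still round, so this is NOT a general fmaf replacement. */
static float fused_reference(float a, float b, float c) {
    double exact_product = (double)a * (double)b;
    return (float)(exact_product + (double)c);
}
```

With `a = b = 1.0f + 0x1p-12f` and `c = -(1.0f + 0x1p-11f)`, the exact product is
1 + 2^-11 + 2^-24. Rounding it to float drops the 2^-24 term, so `mul_then_add`
returns 0, while `fused_reference` returns 2^-24.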

https://github.com/llvm/llvm-project/pull/157633