================
@@ -127,9 +127,9 @@ _CLC_DEF _CLC_OVERLOAD float __clc_sw_fma(float a, float b, float c) {
     return c;
   }
 
-  a = __clc_flush_denormal_if_not_supported(a);
-  b = __clc_flush_denormal_if_not_supported(b);
-  c = __clc_flush_denormal_if_not_supported(c);
+  a = __clc_soft_flush_denormal(a);
+  b = __clc_soft_flush_denormal(b);
+  c = __clc_soft_flush_denormal(c);
----------------
Maetveis wrote:

> surely compiler-rt already has an implementation?

It doesn't. LLVM libc has one but it uses FP64, so I don't think it is of much 
help. I'd expect most targets that don't have hardware fma don't have fp64 
either.

I think dropping sw fma would impact:
- SPIR-V, which would then start generating the `GLSL.std.450` extended 
instruction `Fma`. The problem there is that this instruction is (AFAICT) 
allowed to round intermediate products, but the OpenCL spec doesn't allow that. 
I'm not sure whether drivers actually implement it as fused or not.
Arguably the lowering is a bug in LLVM: `@llvm.fma` is specified to be fused 
without fast math.
- Old R600 targets, not all of which have FMA. I think this change would break 
them. These GPUs are more than 10 years old at this point, though.

https://github.com/llvm/llvm-project/pull/157633