karolherbst wrote: Okay, seems like the division is causing problems:
```diff
diff --git a/libclc/clc/lib/generic/math/clc_atan2.inc b/libclc/clc/lib/generic/math/clc_atan2.inc
index f8e7c9638180..99929acf1e80 100644
--- a/libclc/clc/lib/generic/math/clc_atan2.inc
+++ b/libclc/clc/lib/generic/math/clc_atan2.inc
@@ -22,7 +22,8 @@ _CLC_OVERLOAD _CLC_CONST _CLC_DEF __CLC_FLOATN __clc_atan2(__CLC_FLOATN y,
   __CLC_FLOATN v = __clc_fmin(ax, ay);
   __CLC_FLOATN u = __clc_fmax(ax, ay);
-  __CLC_FLOATN vbyu = v / u;
+  __CLC_FLOATN s = u > 0x1.0p+96f ? 0x1.0p-32f : 1.0f;
+  __CLC_FLOATN vbyu = s * (v / (s * u));
   __CLC_FLOATN a = __clc_atan_reduced(vbyu);
```

The old version did this scaling, and restoring it fixes the issue for me.

Please be aware that many drivers implement division `a / b` as `a * (1.0 / b)` because the hardware lacks a native `fdiv`, with some scaling to account for big/small numbers (like nvidia and rusticl do). And divisions in CLC only have a guaranteed precision of 2.5 ULP (FULL_PROFILE) or 3.0 ULP (EMBEDDED_PROFILE) anyway.

I'm seeing that in my case divide has a max ULP of 2.0, but it could also run into denorms given the big input, which the scaling _should_ prevent though.

https://github.com/llvm/llvm-project/pull/188706
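
To illustrate why the scaling can matter, here is a minimal host-side sketch in plain C. It is not any driver's actual code: `flush_denormal`, `rcp_div` and `scaled_rcp_div` are hypothetical names, and the model assumes a division built from a reciprocal whose denormal results are flushed to zero. Only the `s` selection is taken from the patch.

```c
#include <float.h>

/* Hypothetical model: results below the smallest normal float become zero. */
static float flush_denormal(float x) {
    return (x > -FLT_MIN && x < FLT_MIN) ? 0.0f : x;
}

/* Hypothetical reciprocal-based division: a / b computed as a * (1 / b). */
static float rcp_div(float a, float b) {
    float r = flush_denormal(1.0f / b); /* 1 / b is denormal for b > 2^126 */
    return flush_denormal(a * r);
}

/* Same scaling as in the patch: shrink a huge divisor before dividing,
 * then apply the same factor to the quotient to compensate. */
static float scaled_rcp_div(float v, float u) {
    float s = u > 0x1.0p+96f ? 0x1.0p-32f : 1.0f;
    return s * rcp_div(v, s * u);
}
```

Under these assumptions, with `v = 0x1.0p+126f` and `u = 0x1.0p+127f` the exact quotient is 0.5. `rcp_div(v, u)` returns 0 because `1 / u` is `2^-127`, which is denormal and gets flushed, while `scaled_rcp_div(v, u)` divides by `2^95` instead and returns 0.5.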
