[Bug target/105246] [amdgcn] Use library call for SQRT with -ffast-math + provide additional option to use single-precsion opcode

burnus at gcc dot gnu.org via Gcc-bugs Fri, 13 Mar 2026 09:38:10 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105246


--- Comment #4 from Tobias Burnus <burnus at gcc dot gnu.org> ---
> Namely for x = sqrt(a)
>   x_(n+1) = ½ (x_n + a / x_n)

Note that this obviously fails for a == 0 - and I have no idea how this behaves
for subnormal numbers and other values close to 0.

The expected behavior is otherwise:

* For finite values of x < -0, a domain error shall occur, and either a NaN (if
supported), or an implementation-defined value shall be returned.

* If x is NaN, a NaN shall be returned.

* If x is ±0 or +Inf, x shall be returned.

* If x is -Inf, a domain error shall occur, and a NaN shall be returned.

* * * 

A more algorithmic (Fortran code ...) is:

>From https://www.math.utah.edu/~beebe/software/ieee/sqrt.pdf:

Page 19, (7) has the same equation as above. And:
"With this approximation, Newton iterations … Thus, in single precision, two
iterations suffice to generate a result correct to more than the 24 bits that
we actually store in single precision. Three iterations are enough for IEEE
double precision."

Which implies that - if the number is finite and > 0 - a single iteration seems
to be enough - otherwise, the result should be equal to the GCN's sqrt
intrinsic (assuming the larger exponent is correctly handled as are NAN/+INF
results).
No idea whether that gives a sensible result for denormal numbers but it
shouldn't be too bad for -funsafe-math-optimizations

Thus it would be:

  x = gcn_sqrt (a)
  if (x != 0 && __builtin_isfinite(x))
    x = (x + a/x) / 2.

where the second part of the condition can be left out with -ffinite-math-only.

[Bug target/105246] [amdgcn] Use library call for SQRT with -ffast-math + provide additional option to use single-precsion opcode

Reply via email to