https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902
Rocco Tormenta <rocco at tormenta dot eu> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rocco at tormenta dot eu
--- Comment #31 from Rocco Tormenta <rocco at tormenta dot eu> ---
Hello, I have another basic example. I encountered this issue today while
trying to calculate the squared n-dimensional Euclidean distance between two
points. I apologize if this is not the same issue, though I think it is at
least related given that it broke on the same version.
https://gcc.godbolt.org/z/1cTcazh3Y
That workspace includes proof of concept code, as well as examples of known
workarounds.
The relevant function is:
float nd_sq_euclid(float *a, float *b, int n) {
    float dist = 0.0;
    for (int i = 0; i < n; i++) {
        float d1 = a[i] - b[i];
        dist += d1 * d1;
    }
    return dist;
}
From the generated assembly alone, you can see that some of the code paths
(namely, for n >= 3) do not use FMA.
I included some values (a, b, c) that give different results for FMA and
non-FMA operations, so it is easier to see the difference. Padding the input
with zeroes and increasing n shows that FMA results are produced for n=1 and
n=2, but not starting from n=3.