On 8/9/23 07:51, Alexander Monakov wrote:

> On Wed, 9 Aug 2023, Richard Biener via Gcc-patches wrote:
>
>> The following teaches the non-loop reduction vectorization code to
>> handle non-associatable reductions.  Using the existing FOLD_LEFT_PLUS
>> internal functions might be possible but I'd have to convince myself
>> that +0.0 + x[0] is a safe extra operation in every rounding mode
>> (I also have no way to test the resulting code).
>
> It's not. Under our default -fno-signaling-nans -fno-rounding-math
> negative zero is the neutral element for addition, so '-0.0 + x[0]'
> might be (but negative zero costs more to materialize).
>
> If the reduction has at least two elements, then
>
>         -0.0 + x[0] + x[1]
>
> has the same behavior w.r.t. SNaNs as 'x[0] + x[1]', but unfortunately
> yields negative zero when x[0] = x[1] = +0.0 and rounding towards
> negative infinity (unlike x[0] + x[1], which is +0.0).

Hmm, then there's a bug in a non-released port I worked on a while back. It supports FOLD_LEFT_PLUS by starting the sequence with a +0.0 in the destination register.

I guess if that port ever gets upstreamed I'll have to keep an eye out for that problem. Luckily I think they can synthesize a -0.0 trivially, potentially even zero cost.

Thanks!
Jeff
