Re: Atomic FP fetch_add/fetch_sub design (after fetch_min/fetch_max discussion)

Jakub Jelinek Fri, 13 Feb 2026 02:50:03 -0800

On Fri, Feb 13, 2026 at 10:26:00AM +0000, Matthew Malcomson wrote:
> Hi All,
> 
> I'm picking the floating point atomic fetch_add/fetch_sub work back up and
> would like to get confirmation that a re-design of the floating point atomic
> fetch_add/fetch_sub is sensible.
> 
> It seems to me that the points Jakub raised w.r.t. atomic
> fetch_min/fetch_max apply equally well to the floating point atomic
> fetch_add/fetch_sub work that was already upstream.
> 
> Would people agree that adjusting these FP atomic patches to follow the
> alternate approach makes sense?
> 
> 1) The concern about builtin explosion:
>     - Should we follow the same internal-fn approach with these floating
>       point operations as with the redesigned fetch_min/fetch_max
>       builtins?


Yes.

> 2) The need to wait until after OMP offloading to determine whether the
>     target architecture has the operation or not.
>     - I believe this indicates I should move the CAS expansion for these
>       FP builtins from the C frontend to after IPA?

It doesn't mean you need to wait until after IPA, but you need to be
prepared for the decisions made before that to change.  So, if you decide
to emit something as an internal function before IPA (or pattern recognize
some loop as one) because the target has some optab, you need to be able to
lower the IFN back into a loop after IPA in case the target at that point
doesn't have the optab.  Because the optab before IPA vs. the optab after
IPA can change only with OpenMP/OpenACC offloading, that additional lowering
could be done in a helper function for
execute_omp_device_lower/execute_oacc_device_lower.
So it is about how much code is needed for that and where.
If you have a builtin and decide based on the optab in the FE, you need
code to emit the IFN and code to lower into a GENERIC loop at that point,
plus code, probably in forwprop or wherever else is appropriate, to pattern
match a loop as an IFN if the optab exists, and the offloading lowering of
the IFN back to a loop (this time GIMPLE rather than GENERIC) if the optab
doesn't exist in the offloading target.
Another possibility is to always lower the builtins to the IFN (so lowering
to a GENERIC loop is not needed), do the pattern recognition only post IPA,
and also post IPA (but both early if possible) lower the IFN to a GIMPLE
loop if the optab doesn't exist.
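
As a rough illustration of the kind of loop such an IFN would be lowered to
when the optab is missing, here is a minimal C11 sketch of a float fetch_add
built on compare-and-swap.  The function name fetch_add_float_cas is
hypothetical (it is neither a GCC builtin nor a libatomic entry point), and
the memory orders are just one plausible choice.  The real lowering would be
emitted as GIMPLE by the compiler rather than written as source.

```c
#include <stdatomic.h>

/* Hypothetical helper for illustration only: atomically add VAL to *PTR
   and return the previous value, using a compare-and-swap loop.  */
static float
fetch_add_float_cas (_Atomic float *ptr, float val)
{
  /* Initial guess at the current contents; a relaxed load is enough
     because the CAS below revalidates it.  */
  float expected = atomic_load_explicit (ptr, memory_order_relaxed);
  float desired;

  do
    /* Recompute the sum from the most recently observed value.  */
    desired = expected + val;
  while (!atomic_compare_exchange_weak_explicit (ptr, &expected, desired,
                                                 memory_order_seq_cst,
                                                 memory_order_relaxed));

  /* On success EXPECTED still holds the value that was replaced.  */
  return expected;
}
```

The compare-and-swap compares object representations rather than using
floating point equality, and on failure it refreshes expected with the
current contents, so the loop recomputes the sum and retries.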

> 3) Jakub questioned the need for new libatomic functions w.r.t. min/max
>     - We have no need for these w.r.t. add/sub -- we added them for
>       consistency's sake.  Is there a need for the new FP libatomic
>       functions?

I'd prefer not to add new libatomic APIs if possible, especially floating
point related ones.  You would need to mangle what exact kind of floating
point it is, and there are many with different properties:
float/double/long double (DFmode, Intel 80-bit with various paddings,
IEEE quad, IBM double double), _Float16, _Float32, _Float64, _Float128,
__bf16, VAX float/double, differences about NaN bits, whether they have
infinities/NaNs or not, flushing to zero, -ffast-math (and subflags
thereof) vs. normal, non-signalling vs. signalling NaNs.  The more
libatomic stays away from that, the better...

        Jakub

Re: Atomic FP fetch_add/fetch_sub design (after fetch_min/fetch_max discussion)
