From: Matthew Malcomson <mmalcom...@nvidia.com>

NOT for commit.
Do a demo implementation on AArch64, since that's the backend I'm most
familiar with.  Nothing much else to say -- nice to see that the demo
implementation seems to work as expected (it gets used for fetch_add,
add_fetch and sub_fetch even though it's only defined for fetch_sub).
The demo implementation ensures that I can run some execution tests.

The demo is added behind a flag so that the testsuite can be run in
different variants (with the flag and without), ensuring that the
functionality works both when this optab is implemented and with the
fallback (also checking the two different fallbacks of either calling
libatomic or inlining a CAS loop).

In order to run with both this and the fallback implementation I use
the following flag in RUNTESTFLAGS:
  --target_board='unix {unix/-mtesting-fp-atomics}'

Signed-off-by: Matthew Malcomson <mmalcom...@nvidia.com>
---
 gcc/config/aarch64/aarch64.h   |  2 ++
 gcc/config/aarch64/aarch64.opt |  5 +++++
 gcc/config/aarch64/atomics.md  | 15 +++++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 096c853af7f..c34b020f754 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -124,6 +124,8 @@
 /* Align global data as an optimization.  */
 #define DATA_ALIGNMENT(EXP, ALIGN) aarch64_data_alignment (EXP, ALIGN)
 
+#define TARGET_TESTING_FP_ATOMICS (aarch64_flag_testing_fp_atomics)
+
 /* Align stack data as an optimization.  */
 #define LOCAL_ALIGNMENT(EXP, ALIGN) aarch64_stack_alignment (EXP, ALIGN)
 
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 9ca753e6a88..2cbb711f869 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -352,6 +352,11 @@ moutline-atomics
 Target Var(aarch64_flag_outline_atomics) Init(2) Save
 Generate local calls to out-of-line atomic operations.
 
+mtesting-fp-atomics
+Target Var(aarch64_flag_testing_fp_atomics) Init(0) Save
+Use the demonstration implementation of atomic_fetch_sub_<mode> for floating
+point modes.
+
 -param=aarch64-vect-compare-costs=
 Target Joined UInteger Var(aarch64_vect_compare_costs) Init(1) IntegerRange(0, 1) Param
 When vectorizing, consider using multiple different approaches and use
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index ea4a9367fc8..e95912a2210 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -368,6 +368,21 @@
 ;; However we also implement the acquire memory barrier with DMB LD,
 ;; and so the ST<OP> is not blocked by the barrier.
 
+(define_insn "atomic_fetch_sub<mode>"
+  [(set (match_operand:GPF 0 "register_operand" "=&w")
+	(match_operand:GPF 1 "aarch64_sync_memory_operand" "+Q"))
+   (set (match_dup 1)
+	(unspec_volatile:GPF
+	  [(minus:GPF (match_dup 1)
+		      (match_operand:GPF 2 "register_operand" "w"))
+	   (match_operand:SI 3 "const_int_operand")]
+	  UNSPECV_ATOMIC_LDOP_PLUS))
+   (clobber (match_scratch:GPF 4 "=w"))]
+  "TARGET_TESTING_FP_ATOMICS"
+  "// Here's your sandwich.\;ldr %<s>0, %1\;fsub %<s>4, %<s>0, %<s>2\;str %<s>4, %1\;// END"
+)
+
 (define_insn "aarch64_atomic_<atomic_ldoptab><mode>_lse"
   [(set (match_operand:ALLI 0 "aarch64_sync_memory_operand" "+Q")
     (unspec_volatile:ALLI
-- 
2.43.0