Re: [arm-embedded] [PATCH, ARM 3/7] Refactor atomic compare_and_swap to make it fit for ARMv8-M Baseline
On 22/09/16 17:41, Thomas Preudhomme wrote: Hi, We've decided to apply the following patch to ARM/embedded-6-branch. Sorry I meant ARM/embedded-5-branch. This has just been applied on ARM/embedded-6-branch as well 1 day ago (2016-10-26). Best regards, Thomas
[arm-embedded] [PATCH, ARM 3/7] Refactor atomic compare_and_swap to make it fit for ARMv8-M Baseline
Hi, We've decided to apply the following patch to ARM/embedded-6-branch. Best regards, Thomas --- Begin Message --- Hi, This patch is part of a patch series to add support for atomic operations on ARMv8-M Baseline targets in GCC. This specific patch refactors the expander and splitter for atomics to make the logic work with ARMv8-M Baseline which has limitation of Thumb-1 in terms of CC flag setting and different conditional compare insn patterns. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2016-09-02 Thomas Preud'homme * config/arm/arm.c (arm_expand_compare_and_swap): Add new bdst local variable. Add the new parameter to the insn generator. Set that parameter to be CC flag for 32-bit targets, bval otherwise. Set the return value from the negation of that parameter for Thumb-1, keeping the logic unchanged otherwise except for using bdst as the destination register of the compare_and_swap insn. (arm_split_compare_and_swap): Add explanation about how is the value returned to the function comment. Rename scratch variable to neg_bval. Adapt initialization of variables holding operands to the new operand numbers. Use return register to hold result of store exclusive for Thumb-1, scratch register otherwise. Construct the appropriate cbranch for Thumb-1 targets, keeping the logic unchanged for 32-bit targets. Guard Z flag setting to restrict to 32bit targets. Use gen_cbranchsi4 rather than hand-written conditional branch to loop for strongly ordered compare_and_swap. * config/arm/predicates.md (cc_register_operand): New predicate. * config/arm/sync.md (atomic_compare_and_swap_1): Use a match_operand with the new predicate to accept either the CC flag or a destination register for the boolean return value, restricting it to CC flag only via constraint. Adapt operand numbers accordingly. Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all atomic and synchronization testcases in the testsuite [2]. Patchset was also bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at optimization level -O1 and above [1] without any regression in the testsuite and no code generation difference in libitm and libgomp. Code generation for ARMv8-M Baseline has been manually examined and compared against ARMv8-A Thumb-2 for the following configuration without finding any issue: gcc.dg/atomic-op-2.c at -Os gcc.dg/atomic-compare-exchange-2.c at -Os gcc.dg/atomic-compare-exchange-3.c at -O3 Is this ok for trunk? Best regards, Thomas [1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and undefined ("-O2 -g") [2] The exact list is: gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c gcc/testsuite/gcc.dg/atomic-exchange-1.c gcc/testsuite/gcc.dg/atomic-exchange-2.c gcc/testsuite/gcc.dg/atomic-exchange-3.c gcc/testsuite/gcc.dg/atomic-fence.c gcc/testsuite/gcc.dg/atomic-flag.c gcc/testsuite/gcc.dg/atomic-generic.c gcc/testsuite/gcc.dg/atomic-generic-aux.c gcc/testsuite/gcc.dg/atomic-invalid-2.c gcc/testsuite/gcc.dg/atomic-load-1.c gcc/testsuite/gcc.dg/atomic-load-2.c gcc/testsuite/gcc.dg/atomic-load-3.c gcc/testsuite/gcc.dg/atomic-lockfree.c gcc/testsuite/gcc.dg/atomic-lockfree-aux.c gcc/testsuite/gcc.dg/atomic-noinline.c gcc/testsuite/gcc.dg/atomic-noinline-aux.c gcc/testsuite/gcc.dg/atomic-op-1.c gcc/testsuite/gcc.dg/atomic-op-2.c gcc/testsuite/gcc.dg/atomic-op-3.c gcc/testsuite/gcc.dg/atomic-op-6.c gcc/testsuite/gcc.dg/atomic-store-1.c gcc/testsuite/gcc.dg/atomic-store-2.c gcc/testsuite/gcc.dg/atomic-store-3.c gcc/testsuite/g++.dg/ext/atomic-1.C gcc/testsuite/g++.dg/ext/atomic-2.C gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c gcc/testsuite/gcc.target/arm/atomic-op-acquire.c gcc/testsuite/gcc.target/arm/atomic-op-char.c gcc/testsuite/gcc.target/arm/atomic-op-consume.c gcc/testsuite/gcc.target/arm/atomic-op-int.c gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c gcc/testsuite/gcc.target/arm/atomic-op-release.c gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c gcc/testsuite/gcc.target/arm/atomic-op-short.c gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c gcc/testsuite/gcc.target/arm/sync-1.c gcc/testsuite/gcc.target/arm/synchronize.c gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c gcc/testsuite/gcc.target/arm/armv8-sync-op-acquir
[PATCH, ARM 3/7] Refactor atomic compare_and_swap to make it fit for ARMv8-M Baseline
Hi, This patch is part of a patch series to add support for atomic operations on ARMv8-M Baseline targets in GCC. This specific patch refactors the expander and splitter for atomics to make the logic work with ARMv8-M Baseline which has limitation of Thumb-1 in terms of CC flag setting and different conditional compare insn patterns. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2016-09-02 Thomas Preud'homme * config/arm/arm.c (arm_expand_compare_and_swap): Add new bdst local variable. Add the new parameter to the insn generator. Set that parameter to be CC flag for 32-bit targets, bval otherwise. Set the return value from the negation of that parameter for Thumb-1, keeping the logic unchanged otherwise except for using bdst as the destination register of the compare_and_swap insn. (arm_split_compare_and_swap): Add explanation about how is the value returned to the function comment. Rename scratch variable to neg_bval. Adapt initialization of variables holding operands to the new operand numbers. Use return register to hold result of store exclusive for Thumb-1, scratch register otherwise. Construct the appropriate cbranch for Thumb-1 targets, keeping the logic unchanged for 32-bit targets. Guard Z flag setting to restrict to 32bit targets. Use gen_cbranchsi4 rather than hand-written conditional branch to loop for strongly ordered compare_and_swap. * config/arm/predicates.md (cc_register_operand): New predicate. * config/arm/sync.md (atomic_compare_and_swap_1): Use a match_operand with the new predicate to accept either the CC flag or a destination register for the boolean return value, restricting it to CC flag only via constraint. Adapt operand numbers accordingly. Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all atomic and synchronization testcases in the testsuite [2]. Patchset was also bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at optimization level -O1 and above [1] without any regression in the testsuite and no code generation difference in libitm and libgomp. Code generation for ARMv8-M Baseline has been manually examined and compared against ARMv8-A Thumb-2 for the following configuration without finding any issue: gcc.dg/atomic-op-2.c at -Os gcc.dg/atomic-compare-exchange-2.c at -Os gcc.dg/atomic-compare-exchange-3.c at -O3 Is this ok for trunk? Best regards, Thomas [1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and undefined ("-O2 -g") [2] The exact list is: gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c gcc/testsuite/gcc.dg/atomic-exchange-1.c gcc/testsuite/gcc.dg/atomic-exchange-2.c gcc/testsuite/gcc.dg/atomic-exchange-3.c gcc/testsuite/gcc.dg/atomic-fence.c gcc/testsuite/gcc.dg/atomic-flag.c gcc/testsuite/gcc.dg/atomic-generic.c gcc/testsuite/gcc.dg/atomic-generic-aux.c gcc/testsuite/gcc.dg/atomic-invalid-2.c gcc/testsuite/gcc.dg/atomic-load-1.c gcc/testsuite/gcc.dg/atomic-load-2.c gcc/testsuite/gcc.dg/atomic-load-3.c gcc/testsuite/gcc.dg/atomic-lockfree.c gcc/testsuite/gcc.dg/atomic-lockfree-aux.c gcc/testsuite/gcc.dg/atomic-noinline.c gcc/testsuite/gcc.dg/atomic-noinline-aux.c gcc/testsuite/gcc.dg/atomic-op-1.c gcc/testsuite/gcc.dg/atomic-op-2.c gcc/testsuite/gcc.dg/atomic-op-3.c gcc/testsuite/gcc.dg/atomic-op-6.c gcc/testsuite/gcc.dg/atomic-store-1.c gcc/testsuite/gcc.dg/atomic-store-2.c gcc/testsuite/gcc.dg/atomic-store-3.c gcc/testsuite/g++.dg/ext/atomic-1.C gcc/testsuite/g++.dg/ext/atomic-2.C gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c gcc/testsuite/gcc.target/arm/atomic-op-acquire.c gcc/testsuite/gcc.target/arm/atomic-op-char.c gcc/testsuite/gcc.target/arm/atomic-op-consume.c gcc/testsuite/gcc.target/arm/atomic-op-int.c gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c gcc/testsuite/gcc.target/arm/atomic-op-release.c gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c gcc/testsuite/gcc.target/arm/atomic-op-short.c gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c gcc/testsuite/gcc.target/arm/sync-1.c gcc/testsuite/gcc.target/arm/synchronize.c gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c libstdc++-v3/