> -----Original Message----- > From: Alex Coplan <alex.cop...@arm.com> > Sent: Thursday, March 31, 2022 5:20 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Earnshaw <richard.earns...@arm.com>; Kyrylo Tkachov > <kyrylo.tkac...@arm.com> > Subject: [PATCH][GCC 9] arm: Fix ICEs with compare-and-swap and - > march=armv8-m.base [PR99977] > > Hi, > > This is a backport of the fix for PR99977 to the GCC 9 branch. The only > case where the GCC 10 patch did not apply cleanly was on sync.md, where > some of the context has changed, but the substance of the patch has not > changed, it simply required applying by hand. > > Tested as follows: > - Bootstrap/regtest on arm-linux-gnueabihf. > - Regression tested a cross compiler configured with > --with-arch=armv8-m.base. > > OK for the GCC 9 branch? >
Ok. Thanks, Kyrill > Thanks, > Alex > > --- > > The PR shows two ICEs with __sync_bool_compare_and_swap and > -mcpu=cortex-m23 (equivalently, -march=armv8-m.base): one in LRA and > one > later on, after the CAS insn is split. > > The LRA ICE occurs because the > @atomic_compare_and_swap<CCSI:arch><SIDI:mode>_1 pattern attempts > to tie > two output operands together (operands 0 and 1 in the third > alternative). LRA can't handle this, since it doesn't make sense for an > insn to assign to the same operand twice. > > The later (post-splitting) ICE occurs because the expansion of the > cbranchsi4_scratch insn doesn't quite go according to plan. As it > stands, arm_split_compare_and_swap calls gen_cbranchsi4_scratch, > attempting to pass a register (neg_bval) to use as a scratch register. > However, since the RTL template has a match_scratch here, > gen_cbranchsi4_scratch ignores this argument and produces a scratch rtx. > Since this is all happening after RA, this is doomed to fail (and we get > an ICE about the insn not matching its constraints). > > It seems that the motivation for the choice of constraints in the > atomic_compare_and_swap pattern comes from an attempt to satisfy the > constraints of the cbranchsi4_scratch insn. This insn requires the > scratch register to be the same as the input register in the case that > we use a larger negative immediate (one that satisfies J, but not L). > > Of course, as noted above, LRA refuses to assign two output operands to > the same register, so this was never going to work. > > The solution I'm proposing here is to collapse the alternatives to the > CAS insn (allowing the two output register operands to be matched to > different registers) and to ensure that the constraints for > cbranchsi4_scratch are met in arm_split_compare_and_swap. We do this by > inserting a move to ensure the source and destination registers match if > necessary (i.e. in the case of large negative immediates). > > Another notable change here is that we only do: > > emit_move_insn (neg_bval, const1_rtx); > > for non-negative immediates. This is because the ADDS instruction used in > the negative case suffices to leave a suitable value in neg_bval: if the > operands compare equal, we don't take the branch (so neg_bval will be > set by the load exclusive). Otherwise, the ADDS will leave a nonzero > value in neg_bval, which will correctly signal that the CAS has failed > when it is later negated. > > gcc/ChangeLog: > > PR target/99977 > * config/arm/arm.c (arm_split_compare_and_swap): Fix up codegen > with negative immediates: ensure we expand cbranchsi4_scratch > correctly and ensure we satisfy its constraints. > * config/arm/sync.md > (@atomic_compare_and_swap<CCSI:arch><NARROW:mode>_1): Don't > attempt to tie two output operands together with constraints; > collapse two alternatives. > (@atomic_compare_and_swap<CCSI:arch><SIDI:mode>_1): Likewise. > * config/arm/thumb1.md (cbranchsi4_neg_late): New. > > gcc/testsuite/ChangeLog: > > PR target/99977 > * gcc.target/arm/pr99977.c: New test.