On Mon, Apr 26, 2021 at 5:46 AM Christoph Muellner <cmuell...@gcc.gnu.org> wrote:
> The existing CAS implementation uses an INSN definition, which provides
> the core LR/SC sequence. Additionally to that, there is a follow-up code,
> that evaluates the results and calculates the return values.
> This has two drawbacks: a) an extension to sub-word CAS implementations
> is not possible (even if, then it would be unmaintainable), and b) the
> implementation is hard to maintain/improve.
> This patch provides a programmatic implementation of CAS, similar
> like many other architectures are having one.

I noticed that when the address isn't already valid for lr/sc then we
end up with extra instructions to fix the address. For instance, using
gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c, I get for the lr/sc
loop
.L2:
        addi    a5,a3,%lo(v)
        lr.w    a1, 0(a5)
        bne     a1,a2,.L7
        addi    a1,a3,%lo(v)
        sc.w    a5, a0, 0(a1)
        sext.w  a5,a5
        bne     a5,zero,.L2
and note that there are two addi %lo instructions. The current code gives
        addi    a4,a4,%lo(v)
        1: lr.w a2,0(a4); bne a2,a5,1f; sc.w a6,a0,0(a4); bnez a6,1b; 1:
which is better, as the address is fixed before the lr/sc loop. The
sext is fixed by the REE patch, or by directly generating the
sign-extending sc.w so that isn't an issue here.

Jim
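
P.S. For anyone who wants to look at the code generation themselves, here
is a minimal sketch of the kind of source that leads to an lr/sc loop on a
global. It is not the contents of atomic-compare-exchange-3.c; the function
names cas_v and cas_v_selfcheck are made up for illustration, and the
global is called v only to match the %lo(v) in the listings above. The
self-check exercises the CAS semantics only and says nothing about which
instructions the compiler emits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical global. On RISC-V its address is typically built from a
   %hi/%lo pair, so the %lo part has to be added before lr.w/sc.w can
   use the address. */
int v = 0;

/* Strong compare-and-swap on v: store `desired` only if v == expected.
   Returns true when the swap happened. */
bool cas_v(int expected, int desired)
{
    return __atomic_compare_exchange_n(&v, &expected, desired,
                                       false /* strong, not weak */,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Checks the semantics only, not the generated instruction sequence. */
void cas_v_selfcheck(void)
{
    v = 0;
    assert(cas_v(0, 5));    /* v matched 0, so it becomes 5 */
    assert(v == 5);
    assert(!cas_v(0, 7));   /* v is 5, not 0, so nothing is stored */
    assert(v == 5);
}
```

Compiling something like this for a RISC-V target with -O2 -S should show
whether the addi %lo ends up inside or outside the loop, though the exact
output depends on the compiler version and options.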