> -----Original Message-----
> From: Peter Zijlstra <[email protected]>
> Sent: 29 March 2021 16:57
> To: Guo Ren <[email protected]>
> Cc: linux-riscv <[email protected]>; Linux Kernel Mailing List
> <[email protected]>; [email protected]; linux-arch
> <[email protected]>; Guo Ren <[email protected]>; Will
> Deacon <[email protected]>; Ingo Molnar <[email protected]>; Waiman
> Long <[email protected]>; Arnd Bergmann <[email protected]>; Anup
> Patel <[email protected]>
> Subject: Re: [PATCH v4 3/4] locking/qspinlock: Add
> ARCH_USE_QUEUED_SPINLOCKS_XCHG32
>
> On Mon, Mar 29, 2021 at 07:19:29PM +0800, Guo Ren wrote:
> > On Mon, Mar 29, 2021 at 3:50 PM Peter Zijlstra <[email protected]>
> wrote:
> > >
> > > On Sat, Mar 27, 2021 at 06:06:38PM +0000, [email protected] wrote:
> > > > From: Guo Ren <[email protected]>
> > > >
> > > > Some architectures don't have sub-word swap atomic instruction,
> > > > they only have the full word's one.
> > > >
> > > > The sub-word swap only improve the performance when:
> > > > NR_CPUS < 16K
> > > > * 0- 7: locked byte
> > > > * 8: pending
> > > > * 9-15: not used
> > > > * 16-17: tail index
> > > > * 18-31: tail cpu (+1)
> > > >
> > > > The 9-15 bits are wasted to use xchg16 in xchg_tail.
> > > >
> > > > Please let architecture select xchg16/xchg32 to implement
> > > > xchg_tail.
> > >
> > > So I really don't like this, this pushes complexity into the generic
> > > code for something that's really not needed.
> > >
> > > Lots of RISC already implement sub-word atomics using word ll/sc.
> > > Obviously they're not sharing code like they should be :/ See for
> > > example arch/mips/kernel/cmpxchg.c.
> > I see, we've done two versions of this:
> > - Using cmpxchg codes from MIPS by Michael
> > - Re-write with assembly codes by Guo
> >
> > But using the full-word atomic xchg instructions implement xchg16 has
> > the semantic risk for atomic operations.
>
> What? -ENOPARSE
>
> > > Also, I really do think doing ticket locks first is a far more
> > > sensible step.
> > NACK by Anup
>
> Who's he when he's not sending NAKs ?
We had discussions in the RISC-V platforms group about this. Over there,
We had evaluated all spin lock approaches (ticket, qspinlock, etc) tried
in Linux till now. It was concluded in those discussions that eventually we
have to move to qspinlock (even if we moved to ticket spinlock temporarily)
because qspinlock avoids cache line bouncing. Also, moving to qspinlock
will be aligned with other major architectures supported in Linux (such as
x86, ARM64)
Some of the organizations working on high-end RISC-V systems (> 32
CPUs) are interested in having an optimized spinlock implementation
(just like other major architectures x86 and ARM64).
Based on above, Linux RISC-V should move to qspinlock.
Regards,
Anup