On Fri, Jun 30, 2017 at 02:13:39PM +0100, Will Deacon wrote:
> On Fri, Jun 30, 2017 at 05:38:15AM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 30, 2017 at 10:19:29AM +0100, Will Deacon wrote:
> > > On Thu, Jun 29, 2017 at 05:01:16PM -0700, Paul E. McKenney wrote:
> > > > There is no agreed-upon definition of spin_unlock_wait()'s semantics,
> > > > and it appears that all callers could do just as well with a lock/unlock
> > > > pair.  This commit therefore removes spin_unlock_wait() and related
> > > > definitions from core code.
> > > > 
> > > > Signed-off-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>
> > > > Cc: Arnd Bergmann <a...@arndb.de>
> > > > Cc: Ingo Molnar <mi...@redhat.com>
> > > > Cc: Will Deacon <will.dea...@arm.com>
> > > > Cc: Peter Zijlstra <pet...@infradead.org>
> > > > Cc: Alan Stern <st...@rowland.harvard.edu>
> > > > Cc: Andrea Parri <parri.and...@gmail.com>
> > > > Cc: Linus Torvalds <torva...@linux-foundation.org>
> > > > ---
> > > >  include/asm-generic/qspinlock.h |  14 -----
> > > >  include/linux/spinlock.h        |  31 -----------
> > > >  include/linux/spinlock_up.h     |   6 ---
> > > >  kernel/locking/qspinlock.c      | 117 ----------------------------------------
> > > >  4 files changed, 168 deletions(-)
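
[ For context, the conversion the changelog has in mind is roughly the
  following; this is a hypothetical sketch, not code taken from any
  actual caller.  Instead of waiting for the current lock holder (if
  any) with:

	spin_unlock_wait(&lock);

  a caller acquires and immediately releases the lock, which provides
  at least as strong ordering:

	spin_lock(&lock);
	spin_unlock(&lock);
]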
> > > 
> > > [...]
> > > 
> > > > diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> > > > index b2caec7315af..64a9051e4c2c 100644
> > > > --- a/kernel/locking/qspinlock.c
> > > > +++ b/kernel/locking/qspinlock.c
> > > > @@ -267,123 +267,6 @@ static __always_inline u32 __pv_wait_head_or_lock(struct qspinlock *lock,
> > > >  #define queued_spin_lock_slowpath      native_queued_spin_lock_slowpath
> > > >  #endif
> > > >  
> > > > -/*
> > > > - * Various notes on spin_is_locked() and spin_unlock_wait(), which are
> > > > - * 'interesting' functions:
> > > > - *
> > > > - * PROBLEM: some architectures have an interesting issue with atomic ACQUIRE
> > > > - * operations in that the ACQUIRE applies to the LOAD _not_ the STORE (ARM64,
> > > > - * PPC). Also qspinlock has a similar issue per construction, the setting of
> > > > - * the locked byte can be unordered acquiring the lock proper.
> > > > - *
> > > > - * This gets to be 'interesting' in the following cases, where the /should/s
> > > > - * end up false because of this issue.
> > > > - *
> > > > - *
> > > > - * CASE 1:
> > > > - *
> > > > - * So the spin_is_locked() correctness issue comes from something like:
> > > > - *
> > > > - *   CPU0                              CPU1
> > > > - *
> > > > - *   global_lock();                    local_lock(i)
> > > > - *     spin_lock(&G)                     spin_lock(&L[i])
> > > > - *     for (i)                           if (!spin_is_locked(&G)) {
> > > > - *       spin_unlock_wait(&L[i]);          smp_acquire__after_ctrl_dep();
> > > > - *                                         return;
> > > > - *                                       }
> > > > - *                                       // deal with fail
> > > > - *
> > > > - * Where it is important CPU1 sees G locked or CPU0 sees L[i] locked such
> > > > - * that there is exclusion between the two critical sections.
> > > > - *
> > > > - * The load from spin_is_locked(&G) /should/ be constrained by the ACQUIRE from
> > > > - * spin_lock(&L[i]), and similarly the load(s) from spin_unlock_wait(&L[i])
> > > > - * /should/ be constrained by the ACQUIRE from spin_lock(&G).
> > > > - *
> > > > - * Similarly, later stuff is constrained by the ACQUIRE from CTRL+RMB.
> > > 
> > > Might be worth keeping this comment about spin_is_locked, since we're not
> > > removing that guy just yet!
> > 
> > Ah, all the examples had spin_unlock_wait() in them.  So what I need to
> > do is to create a spin_unlock_wait()-free example to illustrate the
> > text starting with "The load from spin_is_locked(", correct?
> 
> Yeah, I think so.
> 
> > I also need to check all uses of spin_is_locked().  There might no
> > longer be any that rely on any particular ordering...
> 
> Right. I think we're looking for the "insane case" as per 38b850a73034
> (which was apparently used by ipc/sem.c at the time, but no longer).
> 
> There's a usage in kernel/debug/debug_core.c, but it doesn't fill me with
> joy.

That is indeed an interesting one...  But my first round will be to work
out what semantics the various implementations seem to provide:

Acquire courtesy of TSO: s390, sparc, x86.
Acquire: ia64 (in reality fully ordered).
Control dependency: alpha, arc, arm, blackfin, hexagon, m32r, mn10300, tile,
        xtensa.
Control dependency plus leading full barrier: arm64, powerpc.
UP-only: c6x, cris, frv, h8300, m68k, microblaze, nios2, openrisc, um, unicore32.

Special cases:
        metag: Acquire if !CONFIG_METAG_SMP_WRITE_REORDERING.
               Otherwise control dependency?
        mips: Control dependency, acquire if CONFIG_CPU_CAVIUM_OCTEON.
        parisc: Acquire courtesy of TSO, but why barrier in smp_load_acquire?
        sh: Acquire if one of SH4A, SH5, or J2, otherwise acquire?  UP-only?

Are these correct, or am I missing something with any of them?
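
To make the acquire vs. control-dependency distinction concrete, here is
a rough sketch of the two main flavors.  This is hypothetical code, not
lifted from any architecture; assume a qspinlock-style lock word where
_Q_LOCKED_MASK covers the locked byte:

	/* "Acquire" flavor: each load of the lock word is an ACQUIRE,
	 * so everything after the loop is ordered against the prior
	 * critical section. */
	static inline void unlock_wait_acquire(struct qspinlock *lock)
	{
		while (atomic_read_acquire(&lock->val) & _Q_LOCKED_MASK)
			cpu_relax();
	}

	/* "Control dependency" flavor: plain loads, so only later stores
	 * are ordered (via the conditional branch); callers wanting
	 * ACQUIRE must add smp_acquire__after_ctrl_dep() themselves.
	 * Per the list above, arm64/powerpc additionally execute a full
	 * barrier before the loop. */
	static inline void unlock_wait_ctrl_dep(struct qspinlock *lock)
	{
		while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
			cpu_relax();
	}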

                                                        Thanx, Paul
