subject:"\[PATCH v2\] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks"

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-06 Thread Davidlohr Bueso

On Fri, 2014-06-06 at 17:06 +0200, Peter Zijlstra wrote:
> On Mon, Jun 02, 2014 at 10:09:35AM -0700, Linus Torvalds wrote:
> > On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra  wrote:
> > >
> > > And I can't say I'm a particular fan of these ops either, as alternative
> > > I'm almost inclined to just exclude parisc from using opt spinning.
> > 
> > Please do.
> 
> Something like so; if the rwsem stuff lands in .15 we need more for
> that, it doesn't have a convenient CONFIG symbol like this.
> 
> Linus will you take this from email, or should I get it through
> tip/locking/urgent or so?
> 
> ---
> Subject: locking, mutex: Disable optimistic spinning for PA-RISC
> 
> PA-RISC's cmpxchg is not save against normal stores and the code used
> for optimistic spinning is known broken because of this.
> 
> Disable for now.

I almost hit the send button :)

> 
> Reported-by: Mikulas Patocka 
> Signed-off-by: Peter Zijlstra 
> ---

Reviewed-by: Davidlohr Bueso 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-06 Thread Paul E. McKenney

On Fri, Jun 06, 2014 at 05:06:07PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 02, 2014 at 10:09:35AM -0700, Linus Torvalds wrote:
> > On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra  wrote:
> > >
> > > And I can't say I'm a particular fan of these ops either, as alternative
> > > I'm almost inclined to just exclude parisc from using opt spinning.
> > 
> > Please do.
> 
> Something like so; if the rwsem stuff lands in .15 we need more for
> that, it doesn't have a convenient CONFIG symbol like this.
> 
> Linus will you take this from email, or should I get it through
> tip/locking/urgent or so?
> 
> ---
> Subject: locking, mutex: Disable optimistic spinning for PA-RISC
> 
> PA-RISC's cmpxchg is not save against normal stores and the code used
> for optimistic spinning is known broken because of this.
> 
> Disable for now.
> 
> Reported-by: Mikulas Patocka 
> Signed-off-by: Peter Zijlstra 

Reviewed-by: Paul E. McKenney 

> ---
>  kernel/Kconfig.locks | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks
> index 35536d9c0964..9c239e080c2d 100644
> --- a/kernel/Kconfig.locks
> +++ b/kernel/Kconfig.locks
> @@ -222,7 +222,7 @@ endif
> 
>  config MUTEX_SPIN_ON_OWNER
>   def_bool y
> - depends on SMP && !DEBUG_MUTEXES
> + depends on SMP && !DEBUG_MUTEXES && !PARISC
> 
>  config ARCH_USE_QUEUE_RWLOCK
>   bool
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-06 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 10:09:35AM -0700, Linus Torvalds wrote:
> On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra  wrote:
> >
> > And I can't say I'm a particular fan of these ops either, as alternative
> > I'm almost inclined to just exclude parisc from using opt spinning.
> 
> Please do.

Something like so; if the rwsem stuff lands in .15 we need more for
that, it doesn't have a convenient CONFIG symbol like this.

Linus will you take this from email, or should I get it through
tip/locking/urgent or so?

---
Subject: locking, mutex: Disable optimistic spinning for PA-RISC

PA-RISC's cmpxchg is not save against normal stores and the code used
for optimistic spinning is known broken because of this.

Disable for now.

Reported-by: Mikulas Patocka 
Signed-off-by: Peter Zijlstra 
---
 kernel/Kconfig.locks | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks
index 35536d9c0964..9c239e080c2d 100644
--- a/kernel/Kconfig.locks
+++ b/kernel/Kconfig.locks
@@ -222,7 +222,7 @@ endif
 
 config MUTEX_SPIN_ON_OWNER
def_bool y
-   depends on SMP && !DEBUG_MUTEXES
+   depends on SMP && !DEBUG_MUTEXES && !PARISC
 
 config ARCH_USE_QUEUE_RWLOCK
bool
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-06 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 10:09:35AM -0700, Linus Torvalds wrote:
 On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra pet...@infradead.org wrote:
 
  And I can't say I'm a particular fan of these ops either, as alternative
  I'm almost inclined to just exclude parisc from using opt spinning.
 
 Please do.

Something like so; if the rwsem stuff lands in .15 we need more for
that, it doesn't have a convenient CONFIG symbol like this.

Linus will you take this from email, or should I get it through
tip/locking/urgent or so?

---
Subject: locking, mutex: Disable optimistic spinning for PA-RISC

PA-RISC's cmpxchg is not save against normal stores and the code used
for optimistic spinning is known broken because of this.

Disable for now.

Reported-by: Mikulas Patocka mpato...@redhat.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
 kernel/Kconfig.locks | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks
index 35536d9c0964..9c239e080c2d 100644
--- a/kernel/Kconfig.locks
+++ b/kernel/Kconfig.locks
@@ -222,7 +222,7 @@ endif
 
 config MUTEX_SPIN_ON_OWNER
def_bool y
-   depends on SMP  !DEBUG_MUTEXES
+   depends on SMP  !DEBUG_MUTEXES  !PARISC
 
 config ARCH_USE_QUEUE_RWLOCK
bool
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-06 Thread Paul E. McKenney

On Fri, Jun 06, 2014 at 05:06:07PM +0200, Peter Zijlstra wrote:
 On Mon, Jun 02, 2014 at 10:09:35AM -0700, Linus Torvalds wrote:
  On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra pet...@infradead.org wrote:
  
   And I can't say I'm a particular fan of these ops either, as alternative
   I'm almost inclined to just exclude parisc from using opt spinning.
  
  Please do.
 
 Something like so; if the rwsem stuff lands in .15 we need more for
 that, it doesn't have a convenient CONFIG symbol like this.
 
 Linus will you take this from email, or should I get it through
 tip/locking/urgent or so?
 
 ---
 Subject: locking, mutex: Disable optimistic spinning for PA-RISC
 
 PA-RISC's cmpxchg is not save against normal stores and the code used
 for optimistic spinning is known broken because of this.
 
 Disable for now.
 
 Reported-by: Mikulas Patocka mpato...@redhat.com
 Signed-off-by: Peter Zijlstra pet...@infradead.org

Reviewed-by: Paul E. McKenney paul...@linux.vnet.ibm.com

 ---
  kernel/Kconfig.locks | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks
 index 35536d9c0964..9c239e080c2d 100644
 --- a/kernel/Kconfig.locks
 +++ b/kernel/Kconfig.locks
 @@ -222,7 +222,7 @@ endif
 
  config MUTEX_SPIN_ON_OWNER
   def_bool y
 - depends on SMP  !DEBUG_MUTEXES
 + depends on SMP  !DEBUG_MUTEXES  !PARISC
 
  config ARCH_USE_QUEUE_RWLOCK
   bool
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-06 Thread Davidlohr Bueso

On Fri, 2014-06-06 at 17:06 +0200, Peter Zijlstra wrote:
 On Mon, Jun 02, 2014 at 10:09:35AM -0700, Linus Torvalds wrote:
  On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra pet...@infradead.org wrote:
  
   And I can't say I'm a particular fan of these ops either, as alternative
   I'm almost inclined to just exclude parisc from using opt spinning.
  
  Please do.
 
 Something like so; if the rwsem stuff lands in .15 we need more for
 that, it doesn't have a convenient CONFIG symbol like this.
 
 Linus will you take this from email, or should I get it through
 tip/locking/urgent or so?
 
 ---
 Subject: locking, mutex: Disable optimistic spinning for PA-RISC
 
 PA-RISC's cmpxchg is not save against normal stores and the code used
 for optimistic spinning is known broken because of this.
 
 Disable for now.

I almost hit the send button :)

 
 Reported-by: Mikulas Patocka mpato...@redhat.com
 Signed-off-by: Peter Zijlstra pet...@infradead.org
 ---

Reviewed-by: Davidlohr Bueso davidl...@hp.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 03:55:57PM -0700, Linus Torvalds wrote:
> On Mon, Jun 2, 2014 at 3:08 PM, Paul E. McKenney
>  wrote:
> >
> > rcu: Eliminate read-modify-write ACCESS_ONCE() calls
> >
> > preempt_disable();
> > -   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 1;
> > +   lp = this_cpu_ptr(>per_cpu_ref->c[idx]);
> > +   ACCESS_ONCE(*lp) = *lp + 1;
> > smp_mb(); /* B */  /* Avoid leaking the critical section. */
> > -   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->seq[idx]) += 1;
> > +   lp = this_cpu_ptr(>per_cpu_ref->seq[idx]);
> > +   ACCESS_ONCE(*lp) = *lp + 1;
> > preempt_enable();
> > return idx;
> 
> What Eric said. This should just use "this_cpu_inc()" instead.
> Particularly with the smp_mb() and the preempt_enable(), there's no
> way that could/should leak, and the ACCESS_ONCE() seems pointless and
> ugly.
> 
> And the good news is, gcc _will_ generate good code for that.

And here is the update, which passes light rcutorture testing.

Thanx, Paul



rcu: Eliminate read-modify-write ACCESS_ONCE() calls

RCU contains code of the following forms:

ACCESS_ONCE(x)++;
ACCESS_ONCE(x) += y;
ACCESS_ONCE(x) -= y;

Now these constructs do operate correctly, but they really result in a
pair of volatile accesses, one to do the load and another to do the store.
This can be confusing, as the casual reader might well assume that (for
example) gcc might generate a memory-to-memory add instruction for each
of these three cases.  In fact, gcc will do no such thing.  Also, there
is a good chance that the kernel will move to separate load and store
variants of ACCESS_ONCE(), and constructs like the above could easily
confuse both people and scripts attempting to make that sort of change.
Finally, most of RCU's read-modify-write uses of ACCESS_ONCE() really
only need the store to be volatile, so that the read-modify-write form
might be misleading.

This commit therefore changes the above forms in RCU so that each instance
of ACCESS_ONCE() either does a load or a store, but not both.  In a few
cases, ACCESS_ONCE() was not critical, for example, for maintaining
statisitics.  In these cases, ACCESS_ONCE() has been dispensed with
entirely.

Suggested-by: Linus Torvalds 
Signed-off-by: Paul E. McKenney 

diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index c639556f3fa0..e037f3eb2f7b 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -298,9 +298,9 @@ int __srcu_read_lock(struct srcu_struct *sp)
 
idx = ACCESS_ONCE(sp->completed) & 0x1;
preempt_disable();
-   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 1;
+   __this_cpu_inc(sp->per_cpu_ref->c[idx]);
smp_mb(); /* B */  /* Avoid leaking the critical section. */
-   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->seq[idx]) += 1;
+   __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
preempt_enable();
return idx;
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d1c8e4a85b92..f0ed867070cd 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2275,7 +2275,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct 
rcu_data *rdp)
}
smp_mb(); /* List handling before counting for rcu_barrier(). */
rdp->qlen_lazy -= count_lazy;
-   ACCESS_ONCE(rdp->qlen) -= count;
+   ACCESS_ONCE(rdp->qlen) = rdp->qlen - count;
rdp->n_cbs_invoked += count;
 
/* Reinstate batch limit if we have worked down the excess. */
@@ -2420,7 +2420,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
if (rnp_old != NULL)
raw_spin_unlock(_old->fqslock);
if (ret) {
-   ACCESS_ONCE(rsp->n_force_qs_lh)++;
+   rsp->n_force_qs_lh++;
return;
}
rnp_old = rnp;
@@ -2432,7 +2432,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
smp_mb__after_unlock_lock();
raw_spin_unlock(_old->fqslock);
if (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) {
-   ACCESS_ONCE(rsp->n_force_qs_lh)++;
+   rsp->n_force_qs_lh++;
raw_spin_unlock_irqrestore(_old->lock, flags);
return;  /* Someone beat us to it. */
}
@@ -2621,7 +2621,7 @@ __call_rcu(struct rcu_head *head, void (*func)(struct 
rcu_head *rcu),
local_irq_restore(flags);
return;
}
-   ACCESS_ONCE(rdp->qlen)++;
+   ACCESS_ONCE(rdp->qlen) = rdp->qlen + 1;
if (lazy)
rdp->qlen_lazy++;
else
@@ -3185,7 +3185,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 * ACCESS_ONCE() to prevent the compiler from speculating
 * the increment to precede the early-exit check.
 */
-

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Paul E. McKenney

On Tue, Jun 03, 2014 at 05:09:08PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 03, 2014 at 07:07:27AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 03, 2014 at 09:36:13AM +0200, Peter Zijlstra wrote:
> > > 
> > > #ifdef __CHECKER__
> > > #define __atomic  __attribute__((address_space(5)))
> > > #else
> > > #define __atomic
> > > #endif
> > > 
> > > #define store(p, v)   (*(p) = (typeof(*(p)) __force __atomic)(v))
> > > #define load(p)   ((typeof(*p) __force)ACCESS_ONCE(*(p)))
> > > 
> > > Along with changes to xchg() and cmpxchg() that require them to take
> > > pointers to __atomic.
> > > 
> > > That way we keep the flexibility of xchg() and cmpxchg() for being
> > > (mostly) type and size invariant, and get sparse to find wrong usage.
> > > 
> > > Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
> > > store() however they like.
> > 
> > Should be fun interacting with atomic operations on __rcu variables
> > (address space 4).  Of course, that is already fun...
> > 
> 
> Hmm, good point, I suppose sparse doesn't like two different
> address_space annotations on the same variable ?
> 
> /me adds Christpoher Li to the CC list.
> 
> ISTR Mikulas actually listing one such, me digs in recent email..
> 
> > $ grep -w "fdt->fd" */*.c
> > fs/file.c:  free_fdmem(fdt->fd);
> > fs/file.c:  fdt->fd = data;
> > fs/file.c:  free_fdmem(fdt->fd);
> > fs/file.c:  struct file * file = 
> > xchg(>fd[i], NULL);
> 
> So yes, that's going to be fun, mostly because rcu_assign_pointer()
> doesn't actually do the right magic for this to be safe on their
> platform(s).

Maybe at some point sparse needs to keep a bit mask for the address
spaces, so that you caould say somthing like:

struct foo __atomic __rcu *p;

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Peter Zijlstra

On Tue, Jun 03, 2014 at 07:07:27AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 03, 2014 at 09:36:13AM +0200, Peter Zijlstra wrote:
> > 
> > #ifdef __CHECKER__
> > #define __atomic__attribute__((address_space(5)))
> > #else
> > #define __atomic
> > #endif
> > 
> > #define store(p, v) (*(p) = (typeof(*(p)) __force __atomic)(v))
> > #define load(p) ((typeof(*p) __force)ACCESS_ONCE(*(p)))
> > 
> > Along with changes to xchg() and cmpxchg() that require them to take
> > pointers to __atomic.
> > 
> > That way we keep the flexibility of xchg() and cmpxchg() for being
> > (mostly) type and size invariant, and get sparse to find wrong usage.
> > 
> > Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
> > store() however they like.
> 
> Should be fun interacting with atomic operations on __rcu variables
> (address space 4).  Of course, that is already fun...
> 

Hmm, good point, I suppose sparse doesn't like two different
address_space annotations on the same variable ?

/me adds Christpoher Li to the CC list.

ISTR Mikulas actually listing one such, me digs in recent email..

> $ grep -w "fdt->fd" */*.c
> fs/file.c:  free_fdmem(fdt->fd);
> fs/file.c:  fdt->fd = data;
> fs/file.c:  free_fdmem(fdt->fd);
> fs/file.c:  struct file * file = 
> xchg(>fd[i], NULL);

So yes, that's going to be fun, mostly because rcu_assign_pointer()
doesn't actually do the right magic for this to be safe on their
platform(s).




pgp77GFJ3vRtq.pgp
Description: PGP signature

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Paul E. McKenney

On Tue, Jun 03, 2014 at 09:36:13AM +0200, Peter Zijlstra wrote:
> On Mon, Jun 02, 2014 at 05:12:16PM -0400, Mikulas Patocka wrote:
> > The patch adds atomic_pointer_t for all architectures - it is in the 
> > common code and it is backed by atomic_long_t (that already exists for all 
> > architectures). There is no new arch-specific code at all.
> > 
> > When we have atomic_pointer_t, we can find the instances of xchg() and 
> > cmpxchg() and convert them to atomic_pointer_t (or to other atomic*_t 
> > types).
> > 
> > When we convert them all, we can drop xchg() and cmpxchg() at all (at 
> > least from architecture-neutral code).
> > 
> > The problem with xchg() and cmpxchg() is that they are very easy to 
> > misuse. Peter Zijlstra didn't know that they are not atomic w.r.t. normal 
> > stores, a lot of other people don't know it too - and if we allow these 
> > functions to be used, this race condition will reappear in the future 
> > again and again.
> > 
> > That's why I'm proposing atomic_pointer_t - it guarantees that this race 
> > condition can't be made.
> 
> But its horrible, and doesn't have any benefit afaict.
> 
> So if we really want to keep supporting these platforms; I would propose
> something like:
> 
> #ifdef __CHECKER__
> #define __atomic  __attribute__((address_space(5)))
> #else
> #define __atomic
> #endif
> 
> #define store(p, v)   (*(p) = (typeof(*(p)) __force __atomic)(v))
> #define load(p)   ((typeof(*p) __force)ACCESS_ONCE(*(p)))
> 
> Along with changes to xchg() and cmpxchg() that require them to take
> pointers to __atomic.
> 
> That way we keep the flexibility of xchg() and cmpxchg() for being
> (mostly) type and size invariant, and get sparse to find wrong usage.
> 
> Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
> store() however they like.

Should be fun interacting with atomic operations on __rcu variables
(address space 4).  Of course, that is already fun...

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Mikulas Patocka



On Tue, 3 Jun 2014, Peter Zijlstra wrote:

> On Tue, Jun 03, 2014 at 07:14:31AM -0400, Mikulas Patocka wrote:
> > > So if we really want to keep supporting these platforms; I would propose
> > > something like:
> > > 
> > > #ifdef __CHECKER__
> > > #define __atomic  __attribute__((address_space(5)))
> > > #else
> > > #define __atomic
> > > #endif
> > > 
> > > #define store(p, v)   (*(p) = (typeof(*(p)) __force __atomic)(v))
> > > #define load(p)   ((typeof(*p) __force)ACCESS_ONCE(*(p)))
> > > 
> > > Along with changes to xchg() and cmpxchg() that require them to take
> > > pointers to __atomic.
> > > 
> > > That way we keep the flexibility of xchg() and cmpxchg() for being
> > > (mostly) type and size invariant, and get sparse to find wrong usage.
> > > 
> > > Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
> > > store() however they like.
> > 
> > Your proposal is very good because it warns about incorrect usage 
> > automatically.
> 
> Exactly the point.
> 
> > Your usage is very similar to what my patch at the top of this thread 
> > does:
> > 
> > Instead of "__atomic struct s *p;" declaration, my patch uses
> > "atomic_pointer(struct s*) p;" as the declaration
> > Instead of store(, v), my patch uses atomic_pointer_set(, v);
> > Instead of load(), my patch uses atomic_pointer_get();
> > Instead of xchg(, v), my patch uses atomic_pointer_xchg(, v);
> > Instead of cmpxchg(, v1, v2), my patch uses atomic_pointer_cmpxchg(, 
> > v1, v2);
> > 
> > > But its horrible, and doesn't have any benefit afaict.
> > 
> > See the five cases above - why do you say that the operation on the left 
> > is good and the operation on the right is horrible? To me, it looks like 
> > they are both similar, they are just named differently. Both check the 
> > type of the pointer and warns if the user passes incompatible pointer. If 
> > I rename the operations in my patch to store(), load(), xchg(), cmpxchg(), 
> > would you like it?
> 
> Nope.. because the above store,load,xchg,cmpxchg are type invariant and
> work for anything of size (1),2,4,(8).
> 
> So I dislike your proposal on a number of points:
> 
>  1) its got pointer in, and while the immediate problem is indeed with
>  pointers, there is no reason it always should be, so we'll keep on
>  introducing new APIs;
> 
>  2) its got a fixed length, nl. sizeof(void *), if we were to find
>  another case which had the same problem which used 'int' we'd have to
>  again create new APIs;
> 
>  3) you only fixed the one site;
> 
>  4) I'm the lazy kind and atomic_foo_* is just too much typing, let
>  alone remembering all the various new atomic_foo_ APIs resulting from
>  all this.
> 
> This is the place where I really miss C++ templates; and yes before
> people shoot me in the head for that, I do know about all the various
> pitfalls and down sides of those too.
> 
> > My patch has advantage (over your #define __atomic 
> > __attribute__((address_space(5))) ) that it checks the mismatches at 
> > compile time. Your proposal only check them with sparse. But either way - 
> > it is very good that the mismatches are being checked automatically.
> 
> So my proposal goes a lot further in that by making xchg() and cmpxchg()
> require pointer to __atomic, all sites get coverage, not only the one
> case where you found was a problem.
> 
> Yes, this requires a lot more effort, for we'll have to pretty much
> audit and annotate the entire tree, but such things can be done, see for
> example the introduction of __rcu.
> 
> Also, these days we get automagic emails if we introduce new sparse
> fails, so it being sparse and not gcc isn't really any threshold at all.
> 
> > We need some method to catch these races automatically. There are places 
> > where people xchg() or cmpxchg() with direct modifications, see for 
> > example this:
> 
> Yep, so all those places will immediately stand out, the first fail will
> be that those variables aren't marked __atomic, once you do that, the
> direct assignment will complain about crossing the address_space marker.
> 
> Voila, sorted.

I originally wanted to remove PA-RISC xchg and cmpxchg, force compile 
failure on places where it is used and convert them to atomic operations. 
But there's a lot of such places, the patch would be big and it would 
probably trigger some compile failures in configurations that I can't 
test.

So, I agree that your approch with sparse tagging is better, it only warns 
about unsafe use and it won't be breaking compilation for so many people.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Peter Zijlstra

On Tue, Jun 03, 2014 at 07:14:31AM -0400, Mikulas Patocka wrote:
> > So if we really want to keep supporting these platforms; I would propose
> > something like:
> > 
> > #ifdef __CHECKER__
> > #define __atomic__attribute__((address_space(5)))
> > #else
> > #define __atomic
> > #endif
> > 
> > #define store(p, v) (*(p) = (typeof(*(p)) __force __atomic)(v))
> > #define load(p) ((typeof(*p) __force)ACCESS_ONCE(*(p)))
> > 
> > Along with changes to xchg() and cmpxchg() that require them to take
> > pointers to __atomic.
> > 
> > That way we keep the flexibility of xchg() and cmpxchg() for being
> > (mostly) type and size invariant, and get sparse to find wrong usage.
> > 
> > Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
> > store() however they like.
> 
> Your proposal is very good because it warns about incorrect usage 
> automatically.

Exactly the point.

> Your usage is very similar to what my patch at the top of this thread 
> does:
> 
> Instead of "__atomic struct s *p;" declaration, my patch uses
> "atomic_pointer(struct s*) p;" as the declaration
> Instead of store(, v), my patch uses atomic_pointer_set(, v);
> Instead of load(), my patch uses atomic_pointer_get();
> Instead of xchg(, v), my patch uses atomic_pointer_xchg(, v);
> Instead of cmpxchg(, v1, v2), my patch uses atomic_pointer_cmpxchg(, v1, 
> v2);
> 
> > But its horrible, and doesn't have any benefit afaict.
> 
> See the five cases above - why do you say that the operation on the left 
> is good and the operation on the right is horrible? To me, it looks like 
> they are both similar, they are just named differently. Both check the 
> type of the pointer and warns if the user passes incompatible pointer. If 
> I rename the operations in my patch to store(), load(), xchg(), cmpxchg(), 
> would you like it?

Nope.. because the above store,load,xchg,cmpxchg are type invariant and
work for anything of size (1),2,4,(8).

So I dislike your proposal on a number of points:

 1) its got pointer in, and while the immediate problem is indeed with
 pointers, there is no reason it always should be, so we'll keep on
 introducing new APIs;

 2) its got a fixed length, nl. sizeof(void *), if we were to find
 another case which had the same problem which used 'int' we'd have to
 again create new APIs;

 3) you only fixed the one site;

 4) I'm the lazy kind and atomic_foo_* is just too much typing, let
 alone remembering all the various new atomic_foo_ APIs resulting from
 all this.

This is the place where I really miss C++ templates; and yes before
people shoot me in the head for that, I do know about all the various
pitfalls and down sides of those too.

> My patch has advantage (over your #define __atomic 
> __attribute__((address_space(5))) ) that it checks the mismatches at 
> compile time. Your proposal only check them with sparse. But either way - 
> it is very good that the mismatches are being checked automatically.

So my proposal goes a lot further in that by making xchg() and cmpxchg()
require pointer to __atomic, all sites get coverage, not only the one
case where you found was a problem.

Yes, this requires a lot more effort, for we'll have to pretty much
audit and annotate the entire tree, but such things can be done, see for
example the introduction of __rcu.

Also, these days we get automagic emails if we introduce new sparse
fails, so it being sparse and not gcc isn't really any threshold at all.

> We need some method to catch these races automatically. There are places 
> where people xchg() or cmpxchg() with direct modifications, see for 
> example this:

Yep, so all those places will immediately stand out, the first fail will
be that those variables aren't marked __atomic, once you do that, the
direct assignment will complain about crossing the address_space marker.

Voila, sorted.

pgpEpv0OIzBRP.pgp
Description: PGP signature

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Mikulas Patocka



On Tue, 3 Jun 2014, Peter Zijlstra wrote:

> On Mon, Jun 02, 2014 at 05:12:16PM -0400, Mikulas Patocka wrote:
> > The patch adds atomic_pointer_t for all architectures - it is in the 
> > common code and it is backed by atomic_long_t (that already exists for all 
> > architectures). There is no new arch-specific code at all.
> > 
> > When we have atomic_pointer_t, we can find the instances of xchg() and 
> > cmpxchg() and convert them to atomic_pointer_t (or to other atomic*_t 
> > types).
> > 
> > When we convert them all, we can drop xchg() and cmpxchg() at all (at 
> > least from architecture-neutral code).
> > 
> > The problem with xchg() and cmpxchg() is that they are very easy to 
> > misuse. Peter Zijlstra didn't know that they are not atomic w.r.t. normal 
> > stores, a lot of other people don't know it too - and if we allow these 
> > functions to be used, this race condition will reappear in the future 
> > again and again.
> > 
> > That's why I'm proposing atomic_pointer_t - it guarantees that this race 
> > condition can't be made.
> 
> But its horrible, and doesn't have any benefit afaict.
>
> So if we really want to keep supporting these platforms; I would propose
> something like:
> 
> #ifdef __CHECKER__
> #define __atomic  __attribute__((address_space(5)))
> #else
> #define __atomic
> #endif
> 
> #define store(p, v)   (*(p) = (typeof(*(p)) __force __atomic)(v))
> #define load(p)   ((typeof(*p) __force)ACCESS_ONCE(*(p)))
> 
> Along with changes to xchg() and cmpxchg() that require them to take
> pointers to __atomic.
> 
> That way we keep the flexibility of xchg() and cmpxchg() for being
> (mostly) type and size invariant, and get sparse to find wrong usage.
> 
> Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
> store() however they like.

Your proposal is very good because it warns about incorrect usage 
automatically.



Your usage is very similar to what my patch at the top of this thread 
does:

Instead of "__atomic struct s *p;" declaration, my patch uses
"atomic_pointer(struct s*) p;" as the declaration
Instead of store(, v), my patch uses atomic_pointer_set(, v);
Instead of load(), my patch uses atomic_pointer_get();
Instead of xchg(, v), my patch uses atomic_pointer_xchg(, v);
Instead of cmpxchg(, v1, v2), my patch uses atomic_pointer_cmpxchg(, v1, 
v2);

> But its horrible, and doesn't have any benefit afaict.

See the five cases above - why do you say that the operation on the left 
is good and the operation on the right is horrible? To me, it looks like 
they are both similar, they are just named differently. Both check the 
type of the pointer and warns if the user passes incompatible pointer. If 
I rename the operations in my patch to store(), load(), xchg(), cmpxchg(), 
would you like it?


My patch has advantage (over your #define __atomic 
__attribute__((address_space(5))) ) that it checks the mismatches at 
compile time. Your proposal only check them with sparse. But either way - 
it is very good that the mismatches are being checked automatically.




We need some method to catch these races automatically. There are places 
where people xchg() or cmpxchg() with direct modifications, see for 
example this:


$ grep -w mnt_expiry_mark */*.c
fs/namespace.c: if (unlikely(m->mnt_expiry_mark))
fs/namespace.c: m->mnt_expiry_mark = 0;
fs/namespace.c: if (!xchg(>mnt_expiry_mark, 1))
fs/namespace.c: /* we mustn't call path_put() as that would clear 
mnt_expiry_mark */
fs/namespace.c: if (!xchg(>mnt_expiry_mark, 1) ||

$ grep "sub_info->complete" */*.c
kernel/kmod.c:  struct completion *comp = xchg(_info->complete, NULL);
kernel/kmod.c:  sub_info->complete = 
kernel/kmod.c:  if (xchg(_info->complete, NULL))

$ grep -w "fdt->fd" */*.c
fs/file.c:  free_fdmem(fdt->fd);
fs/file.c:  fdt->fd = data;
fs/file.c:  free_fdmem(fdt->fd);
fs/file.c:  struct file * file = xchg(>fd[i], 
NULL);
fs/file.c:  if (rcu_access_pointer(fdt->fd[fd]) != NULL) {
fs/file.c:  rcu_assign_pointer(fdt->fd[fd], NULL);
fs/file.c:  BUG_ON(fdt->fd[fd] != NULL);
fs/file.c:  rcu_assign_pointer(fdt->fd[fd], file);
fs/file.c:  file = fdt->fd[fd];
fs/file.c:  rcu_assign_pointer(fdt->fd[fd], NULL);
fs/file.c:  file = fdt->fd[fd];
fs/file.c:  rcu_assign_pointer(fdt->fd[fd], NULL);
fs/file.c:  tofree = fdt->fd[fd];
fs/file.c:  rcu_assign_pointer(fdt->fd[fd], file);
fs/file.c:  file = rcu_dereference_check_fdtable(files, fdt->fd[n]);

$ grep -w exit_state */*.c
fs/exec.c:  if (likely(leader->exit_state))
fs/exec.c:  BUG_ON(leader->exit_state != EXIT_ZOMBIE);
fs/exec.c:  leader->exit_state = EXIT_DEAD;
kernel/exit.c:  if (leader != p && thread_group_empty(leader) && 
leader->exit_state == EXIT_ZOMBIE) {
kernel/exit.c:  leader->exit_state = EXIT_DEAD;

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 01:22:10PM -0700, Linus Torvalds wrote:
> On Mon, Jun 2, 2014 at 1:05 PM, Peter Zijlstra  wrote:
> >
> > So the question is, do you prefer subtly broken code or hard compile
> > fails? Me, I go for the compile fail.
> 
> The thing is, parisc has a perfectly fine "cmpxchg" implementation in
> practice, and ACCESS_ONCE() and friends work fine too for reading.
> 
> What the "use a spinlock" approach cannot generally do is:
> 
>  - ACCESS_ONCE() to _write_ things doesn't work well. You really
> should use "atomic_set()".
> 
>  - you may not necessarily be able to mix partial updates (ie
> differently sized updates to the same thing) depending on just how the
> spinlock hashing works
> 
> but both of those are really rare issues and don't affect normal code.

Agreed on the second, although that would be fairly easy to fix by
masking out the lower few bits in the pointer address before hashing.

The first, you're probably, right, but seeing how its a completely
silent fail atm I'm not at all comfortable with it.


pgpF0N0klizvb.pgp
Description: PGP signature

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 05:12:16PM -0400, Mikulas Patocka wrote:
> The patch adds atomic_pointer_t for all architectures - it is in the 
> common code and it is backed by atomic_long_t (that already exists for all 
> architectures). There is no new arch-specific code at all.
> 
> When we have atomic_pointer_t, we can find the instances of xchg() and 
> cmpxchg() and convert them to atomic_pointer_t (or to other atomic*_t 
> types).
> 
> When we convert them all, we can drop xchg() and cmpxchg() at all (at 
> least from architecture-neutral code).
> 
> The problem with xchg() and cmpxchg() is that they are very easy to 
> misuse. Peter Zijlstra didn't know that they are not atomic w.r.t. normal 
> stores, a lot of other people don't know it too - and if we allow these 
> functions to be used, this race condition will reappear in the future 
> again and again.
> 
> That's why I'm proposing atomic_pointer_t - it guarantees that this race 
> condition can't be made.

But its horrible, and doesn't have any benefit afaict.

So if we really want to keep supporting these platforms; I would propose
something like:

#ifdef __CHECKER__
#define __atomic__attribute__((address_space(5)))
#else
#define __atomic
#endif

#define store(p, v) (*(p) = (typeof(*(p)) __force __atomic)(v))
#define load(p) ((typeof(*p) __force)ACCESS_ONCE(*(p)))

Along with changes to xchg() and cmpxchg() that require them to take
pointers to __atomic.

That way we keep the flexibility of xchg() and cmpxchg() for being
(mostly) type and size invariant, and get sparse to find wrong usage.

Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
store() however they like.




pgpfCW04r6dWY.pgp
Description: PGP signature

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 04:46:32PM -0400, Mikulas Patocka wrote:
> It's not only parisc - tile32, arc, metag (maybe hexagon) are broken too, 
> because they don't have cmpxchg in hardware.

metag actually does, and the lock1 thing is a fallback/test thing:

config METAG_ATOMICITY_LOCK1
depends on SMP
bool "lock1"
help
  This option uses the LOCK1 instruction for atomicity. This is mainly
  provided as a debugging aid if the lnkget/lnkset atomicity primitive
  isn't working properly.

Then again, metag has qualiteee bits like:

config METAG_SMP_WRITE_REORDERING
bool
help
  This attempts to prevent cache-memory incoherence due to external
  reordering of writes from different hardware threads when SMP is
  enabled. It adds fences (system event 0) to smp_mb and smp_rmb in an
  attempt to catch some of the cases, and also before writes to shared
  memory in LOCK1 protected atomics and spinlocks.
  This will not completely prevent cache incoherency on affected cores.

Which makes me back away slowly before starting to run.

And there is sane arc hardware:

config ARC_HAS_LLSC
bool "Insn: LLOCK/SCOND (efficient atomic ops)"
default y
depends on ARC_CPU_770 && !ARC_CANT_LLSC

tile32 is indeed equally wrecked, but at least they have a tile64 system
that is useful (albeit somewhat strange).

Same for sparc32, there's sparc64 which is sane.

parisc otoh is a dead arch that never got sane.

pgpTTpsARwENk.pgp
Description: PGP signature

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 04:46:32PM -0400, Mikulas Patocka wrote:
 It's not only parisc - tile32, arc, metag (maybe hexagon) are broken too, 
 because they don't have cmpxchg in hardware.

metag actually does, and the lock1 thing is a fallback/test thing:

config METAG_ATOMICITY_LOCK1
depends on SMP
bool lock1
help
  This option uses the LOCK1 instruction for atomicity. This is mainly
  provided as a debugging aid if the lnkget/lnkset atomicity primitive
  isn't working properly.

Then again, metag has qualiteee bits like:

config METAG_SMP_WRITE_REORDERING
bool
help
  This attempts to prevent cache-memory incoherence due to external
  reordering of writes from different hardware threads when SMP is
  enabled. It adds fences (system event 0) to smp_mb and smp_rmb in an
  attempt to catch some of the cases, and also before writes to shared
  memory in LOCK1 protected atomics and spinlocks.
  This will not completely prevent cache incoherency on affected cores.

Which makes me back away slowly before starting to run.

And there is sane arc hardware:

config ARC_HAS_LLSC
bool Insn: LLOCK/SCOND (efficient atomic ops)
default y
depends on ARC_CPU_770  !ARC_CANT_LLSC

tile32 is indeed equally wrecked, but at least they have a tile64 system
that is useful (albeit somewhat strange).

Same for sparc32, there's sparc64 which is sane.

parisc otoh is a dead arch that never got sane.


pgpTTpsARwENk.pgp
Description: PGP signature

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 05:12:16PM -0400, Mikulas Patocka wrote:
 The patch adds atomic_pointer_t for all architectures - it is in the 
 common code and it is backed by atomic_long_t (that already exists for all 
 architectures). There is no new arch-specific code at all.
 
 When we have atomic_pointer_t, we can find the instances of xchg() and 
 cmpxchg() and convert them to atomic_pointer_t (or to other atomic*_t 
 types).
 
 When we convert them all, we can drop xchg() and cmpxchg() at all (at 
 least from architecture-neutral code).
 
 The problem with xchg() and cmpxchg() is that they are very easy to 
 misuse. Peter Zijlstra didn't know that they are not atomic w.r.t. normal 
 stores, a lot of other people don't know it too - and if we allow these 
 functions to be used, this race condition will reappear in the future 
 again and again.
 
 That's why I'm proposing atomic_pointer_t - it guarantees that this race 
 condition can't be made.

But its horrible, and doesn't have any benefit afaict.

So if we really want to keep supporting these platforms; I would propose
something like:

#ifdef __CHECKER__
#define __atomic__attribute__((address_space(5)))
#else
#define __atomic
#endif

#define store(p, v) (*(p) = (typeof(*(p)) __force __atomic)(v))
#define load(p) ((typeof(*p) __force)ACCESS_ONCE(*(p)))

Along with changes to xchg() and cmpxchg() that require them to take
pointers to __atomic.

That way we keep the flexibility of xchg() and cmpxchg() for being
(mostly) type and size invariant, and get sparse to find wrong usage.

Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
store() however they like.




pgpfCW04r6dWY.pgp
Description: PGP signature

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 01:22:10PM -0700, Linus Torvalds wrote:
 On Mon, Jun 2, 2014 at 1:05 PM, Peter Zijlstra pet...@infradead.org wrote:
 
  So the question is, do you prefer subtly broken code or hard compile
  fails? Me, I go for the compile fail.
 
 The thing is, parisc has a perfectly fine cmpxchg implementation in
 practice, and ACCESS_ONCE() and friends work fine too for reading.
 
 What the use a spinlock approach cannot generally do is:
 
  - ACCESS_ONCE() to _write_ things doesn't work well. You really
 should use atomic_set().
 
  - you may not necessarily be able to mix partial updates (ie
 differently sized updates to the same thing) depending on just how the
 spinlock hashing works
 
 but both of those are really rare issues and don't affect normal code.

Agreed on the second, although that would be fairly easy to fix by
masking out the lower few bits in the pointer address before hashing.

The first, you're probably, right, but seeing how its a completely
silent fail atm I'm not at all comfortable with it.


pgpF0N0klizvb.pgp
Description: PGP signature

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Mikulas Patocka



On Tue, 3 Jun 2014, Peter Zijlstra wrote:

 On Mon, Jun 02, 2014 at 05:12:16PM -0400, Mikulas Patocka wrote:
  The patch adds atomic_pointer_t for all architectures - it is in the 
  common code and it is backed by atomic_long_t (that already exists for all 
  architectures). There is no new arch-specific code at all.
  
  When we have atomic_pointer_t, we can find the instances of xchg() and 
  cmpxchg() and convert them to atomic_pointer_t (or to other atomic*_t 
  types).
  
  When we convert them all, we can drop xchg() and cmpxchg() at all (at 
  least from architecture-neutral code).
  
  The problem with xchg() and cmpxchg() is that they are very easy to 
  misuse. Peter Zijlstra didn't know that they are not atomic w.r.t. normal 
  stores, a lot of other people don't know it too - and if we allow these 
  functions to be used, this race condition will reappear in the future 
  again and again.
  
  That's why I'm proposing atomic_pointer_t - it guarantees that this race 
  condition can't be made.
 
 But its horrible, and doesn't have any benefit afaict.

 So if we really want to keep supporting these platforms; I would propose
 something like:
 
 #ifdef __CHECKER__
 #define __atomic  __attribute__((address_space(5)))
 #else
 #define __atomic
 #endif
 
 #define store(p, v)   (*(p) = (typeof(*(p)) __force __atomic)(v))
 #define load(p)   ((typeof(*p) __force)ACCESS_ONCE(*(p)))
 
 Along with changes to xchg() and cmpxchg() that require them to take
 pointers to __atomic.
 
 That way we keep the flexibility of xchg() and cmpxchg() for being
 (mostly) type and size invariant, and get sparse to find wrong usage.
 
 Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
 store() however they like.

Your proposal is very good because it warns about incorrect usage 
automatically.



Your usage is very similar to what my patch at the top of this thread 
does:

Instead of __atomic struct s *p; declaration, my patch uses
atomic_pointer(struct s*) p; as the declaration
Instead of store(p, v), my patch uses atomic_pointer_set(p, v);
Instead of load(p), my patch uses atomic_pointer_get(p);
Instead of xchg(p, v), my patch uses atomic_pointer_xchg(p, v);
Instead of cmpxchg(p, v1, v2), my patch uses atomic_pointer_cmpxchg(p1, v1, 
v2);

 But its horrible, and doesn't have any benefit afaict.

See the five cases above - why do you say that the operation on the left 
is good and the operation on the right is horrible? To me, it looks like 
they are both similar, they are just named differently. Both check the 
type of the pointer and warns if the user passes incompatible pointer. If 
I rename the operations in my patch to store(), load(), xchg(), cmpxchg(), 
would you like it?


My patch has advantage (over your #define __atomic 
__attribute__((address_space(5))) ) that it checks the mismatches at 
compile time. Your proposal only check them with sparse. But either way - 
it is very good that the mismatches are being checked automatically.




We need some method to catch these races automatically. There are places 
where people xchg() or cmpxchg() with direct modifications, see for 
example this:


$ grep -w mnt_expiry_mark */*.c
fs/namespace.c: if (unlikely(m-mnt_expiry_mark))
fs/namespace.c: m-mnt_expiry_mark = 0;
fs/namespace.c: if (!xchg(mnt-mnt_expiry_mark, 1))
fs/namespace.c: /* we mustn't call path_put() as that would clear 
mnt_expiry_mark */
fs/namespace.c: if (!xchg(mnt-mnt_expiry_mark, 1) ||

$ grep sub_info-complete */*.c
kernel/kmod.c:  struct completion *comp = xchg(sub_info-complete, NULL);
kernel/kmod.c:  sub_info-complete = done;
kernel/kmod.c:  if (xchg(sub_info-complete, NULL))

$ grep -w fdt-fd */*.c
fs/file.c:  free_fdmem(fdt-fd);
fs/file.c:  fdt-fd = data;
fs/file.c:  free_fdmem(fdt-fd);
fs/file.c:  struct file * file = xchg(fdt-fd[i], 
NULL);
fs/file.c:  if (rcu_access_pointer(fdt-fd[fd]) != NULL) {
fs/file.c:  rcu_assign_pointer(fdt-fd[fd], NULL);
fs/file.c:  BUG_ON(fdt-fd[fd] != NULL);
fs/file.c:  rcu_assign_pointer(fdt-fd[fd], file);
fs/file.c:  file = fdt-fd[fd];
fs/file.c:  rcu_assign_pointer(fdt-fd[fd], NULL);
fs/file.c:  file = fdt-fd[fd];
fs/file.c:  rcu_assign_pointer(fdt-fd[fd], NULL);
fs/file.c:  tofree = fdt-fd[fd];
fs/file.c:  rcu_assign_pointer(fdt-fd[fd], file);
fs/file.c:  file = rcu_dereference_check_fdtable(files, fdt-fd[n]);

$ grep -w exit_state */*.c
fs/exec.c:  if (likely(leader-exit_state))
fs/exec.c:  BUG_ON(leader-exit_state != EXIT_ZOMBIE);
fs/exec.c:  leader-exit_state = EXIT_DEAD;
kernel/exit.c:  if (leader != p  thread_group_empty(leader)  
leader-exit_state == EXIT_ZOMBIE) {
kernel/exit.c:  leader-exit_state = EXIT_DEAD;
kernel/exit.c:  (p-exit_state  thread_group_empty(p)) ||

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Peter Zijlstra

On Tue, Jun 03, 2014 at 07:14:31AM -0400, Mikulas Patocka wrote:
  So if we really want to keep supporting these platforms; I would propose
  something like:
  
  #ifdef __CHECKER__
  #define __atomic__attribute__((address_space(5)))
  #else
  #define __atomic
  #endif
  
  #define store(p, v) (*(p) = (typeof(*(p)) __force __atomic)(v))
  #define load(p) ((typeof(*p) __force)ACCESS_ONCE(*(p)))
  
  Along with changes to xchg() and cmpxchg() that require them to take
  pointers to __atomic.
  
  That way we keep the flexibility of xchg() and cmpxchg() for being
  (mostly) type and size invariant, and get sparse to find wrong usage.
  
  Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
  store() however they like.
 
 Your proposal is very good because it warns about incorrect usage 
 automatically.

Exactly the point.

 Your usage is very similar to what my patch at the top of this thread 
 does:
 
 Instead of __atomic struct s *p; declaration, my patch uses
 atomic_pointer(struct s*) p; as the declaration
 Instead of store(p, v), my patch uses atomic_pointer_set(p, v);
 Instead of load(p), my patch uses atomic_pointer_get(p);
 Instead of xchg(p, v), my patch uses atomic_pointer_xchg(p, v);
 Instead of cmpxchg(p, v1, v2), my patch uses atomic_pointer_cmpxchg(p1, v1, 
 v2);
 
  But its horrible, and doesn't have any benefit afaict.
 
 See the five cases above - why do you say that the operation on the left 
 is good and the operation on the right is horrible? To me, it looks like 
 they are both similar, they are just named differently. Both check the 
 type of the pointer and warns if the user passes incompatible pointer. If 
 I rename the operations in my patch to store(), load(), xchg(), cmpxchg(), 
 would you like it?

Nope.. because the above store,load,xchg,cmpxchg are type invariant and
work for anything of size (1),2,4,(8).

So I dislike your proposal on a number of points:

 1) its got pointer in, and while the immediate problem is indeed with
 pointers, there is no reason it always should be, so we'll keep on
 introducing new APIs;

 2) its got a fixed length, nl. sizeof(void *), if we were to find
 another case which had the same problem which used 'int' we'd have to
 again create new APIs;

 3) you only fixed the one site;

 4) I'm the lazy kind and atomic_foo_* is just too much typing, let
 alone remembering all the various new atomic_foo_ APIs resulting from
 all this.

This is the place where I really miss C++ templates; and yes before
people shoot me in the head for that, I do know about all the various
pitfalls and down sides of those too.

 My patch has advantage (over your #define __atomic 
 __attribute__((address_space(5))) ) that it checks the mismatches at 
 compile time. Your proposal only check them with sparse. But either way - 
 it is very good that the mismatches are being checked automatically.

So my proposal goes a lot further in that by making xchg() and cmpxchg()
require pointer to __atomic, all sites get coverage, not only the one
case where you found was a problem.

Yes, this requires a lot more effort, for we'll have to pretty much
audit and annotate the entire tree, but such things can be done, see for
example the introduction of __rcu.

Also, these days we get automagic emails if we introduce new sparse
fails, so it being sparse and not gcc isn't really any threshold at all.

 We need some method to catch these races automatically. There are places 
 where people xchg() or cmpxchg() with direct modifications, see for 
 example this:

Yep, so all those places will immediately stand out, the first fail will
be that those variables aren't marked __atomic, once you do that, the
direct assignment will complain about crossing the address_space marker.

Voila, sorted.


pgpEpv0OIzBRP.pgp
Description: PGP signature

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Mikulas Patocka



On Tue, 3 Jun 2014, Peter Zijlstra wrote:

 On Tue, Jun 03, 2014 at 07:14:31AM -0400, Mikulas Patocka wrote:
   So if we really want to keep supporting these platforms; I would propose
   something like:
   
   #ifdef __CHECKER__
   #define __atomic  __attribute__((address_space(5)))
   #else
   #define __atomic
   #endif
   
   #define store(p, v)   (*(p) = (typeof(*(p)) __force __atomic)(v))
   #define load(p)   ((typeof(*p) __force)ACCESS_ONCE(*(p)))
   
   Along with changes to xchg() and cmpxchg() that require them to take
   pointers to __atomic.
   
   That way we keep the flexibility of xchg() and cmpxchg() for being
   (mostly) type and size invariant, and get sparse to find wrong usage.
   
   Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
   store() however they like.
  
  Your proposal is very good because it warns about incorrect usage 
  automatically.
 
 Exactly the point.
 
  Your usage is very similar to what my patch at the top of this thread 
  does:
  
  Instead of __atomic struct s *p; declaration, my patch uses
  atomic_pointer(struct s*) p; as the declaration
  Instead of store(p, v), my patch uses atomic_pointer_set(p, v);
  Instead of load(p), my patch uses atomic_pointer_get(p);
  Instead of xchg(p, v), my patch uses atomic_pointer_xchg(p, v);
  Instead of cmpxchg(p, v1, v2), my patch uses atomic_pointer_cmpxchg(p1, 
  v1, v2);
  
   But its horrible, and doesn't have any benefit afaict.
  
  See the five cases above - why do you say that the operation on the left 
  is good and the operation on the right is horrible? To me, it looks like 
  they are both similar, they are just named differently. Both check the 
  type of the pointer and warns if the user passes incompatible pointer. If 
  I rename the operations in my patch to store(), load(), xchg(), cmpxchg(), 
  would you like it?
 
 Nope.. because the above store,load,xchg,cmpxchg are type invariant and
 work for anything of size (1),2,4,(8).
 
 So I dislike your proposal on a number of points:
 
  1) its got pointer in, and while the immediate problem is indeed with
  pointers, there is no reason it always should be, so we'll keep on
  introducing new APIs;
 
  2) its got a fixed length, nl. sizeof(void *), if we were to find
  another case which had the same problem which used 'int' we'd have to
  again create new APIs;
 
  3) you only fixed the one site;
 
  4) I'm the lazy kind and atomic_foo_* is just too much typing, let
  alone remembering all the various new atomic_foo_ APIs resulting from
  all this.
 
 This is the place where I really miss C++ templates; and yes before
 people shoot me in the head for that, I do know about all the various
 pitfalls and down sides of those too.
 
  My patch has advantage (over your #define __atomic 
  __attribute__((address_space(5))) ) that it checks the mismatches at 
  compile time. Your proposal only check them with sparse. But either way - 
  it is very good that the mismatches are being checked automatically.
 
 So my proposal goes a lot further in that by making xchg() and cmpxchg()
 require pointer to __atomic, all sites get coverage, not only the one
 case where you found was a problem.
 
 Yes, this requires a lot more effort, for we'll have to pretty much
 audit and annotate the entire tree, but such things can be done, see for
 example the introduction of __rcu.
 
 Also, these days we get automagic emails if we introduce new sparse
 fails, so it being sparse and not gcc isn't really any threshold at all.
 
  We need some method to catch these races automatically. There are places 
  where people xchg() or cmpxchg() with direct modifications, see for 
  example this:
 
 Yep, so all those places will immediately stand out, the first fail will
 be that those variables aren't marked __atomic, once you do that, the
 direct assignment will complain about crossing the address_space marker.
 
 Voila, sorted.

I originally wanted to remove PA-RISC xchg and cmpxchg, force compile 
failure on places where it is used and convert them to atomic operations. 
But there's a lot of such places, the patch would be big and it would 
probably trigger some compile failures in configurations that I can't 
test.

So, I agree that your approch with sparse tagging is better, it only warns 
about unsafe use and it won't be breaking compilation for so many people.

Mikulas
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Paul E. McKenney

On Tue, Jun 03, 2014 at 09:36:13AM +0200, Peter Zijlstra wrote:
 On Mon, Jun 02, 2014 at 05:12:16PM -0400, Mikulas Patocka wrote:
  The patch adds atomic_pointer_t for all architectures - it is in the 
  common code and it is backed by atomic_long_t (that already exists for all 
  architectures). There is no new arch-specific code at all.
  
  When we have atomic_pointer_t, we can find the instances of xchg() and 
  cmpxchg() and convert them to atomic_pointer_t (or to other atomic*_t 
  types).
  
  When we convert them all, we can drop xchg() and cmpxchg() at all (at 
  least from architecture-neutral code).
  
  The problem with xchg() and cmpxchg() is that they are very easy to 
  misuse. Peter Zijlstra didn't know that they are not atomic w.r.t. normal 
  stores, a lot of other people don't know it too - and if we allow these 
  functions to be used, this race condition will reappear in the future 
  again and again.
  
  That's why I'm proposing atomic_pointer_t - it guarantees that this race 
  condition can't be made.
 
 But its horrible, and doesn't have any benefit afaict.
 
 So if we really want to keep supporting these platforms; I would propose
 something like:
 
 #ifdef __CHECKER__
 #define __atomic  __attribute__((address_space(5)))
 #else
 #define __atomic
 #endif
 
 #define store(p, v)   (*(p) = (typeof(*(p)) __force __atomic)(v))
 #define load(p)   ((typeof(*p) __force)ACCESS_ONCE(*(p)))
 
 Along with changes to xchg() and cmpxchg() that require them to take
 pointers to __atomic.
 
 That way we keep the flexibility of xchg() and cmpxchg() for being
 (mostly) type and size invariant, and get sparse to find wrong usage.
 
 Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
 store() however they like.

Should be fun interacting with atomic operations on __rcu variables
(address space 4).  Of course, that is already fun...

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Peter Zijlstra

On Tue, Jun 03, 2014 at 07:07:27AM -0700, Paul E. McKenney wrote:
 On Tue, Jun 03, 2014 at 09:36:13AM +0200, Peter Zijlstra wrote:
  
  #ifdef __CHECKER__
  #define __atomic__attribute__((address_space(5)))
  #else
  #define __atomic
  #endif
  
  #define store(p, v) (*(p) = (typeof(*(p)) __force __atomic)(v))
  #define load(p) ((typeof(*p) __force)ACCESS_ONCE(*(p)))
  
  Along with changes to xchg() and cmpxchg() that require them to take
  pointers to __atomic.
  
  That way we keep the flexibility of xchg() and cmpxchg() for being
  (mostly) type and size invariant, and get sparse to find wrong usage.
  
  Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
  store() however they like.
 
 Should be fun interacting with atomic operations on __rcu variables
 (address space 4).  Of course, that is already fun...
 

Hmm, good point, I suppose sparse doesn't like two different
address_space annotations on the same variable ?

/me adds Christpoher Li to the CC list.

ISTR Mikulas actually listing one such, me digs in recent email..

 $ grep -w fdt-fd */*.c
 fs/file.c:  free_fdmem(fdt-fd);
 fs/file.c:  fdt-fd = data;
 fs/file.c:  free_fdmem(fdt-fd);
 fs/file.c:  struct file * file = 
 xchg(fdt-fd[i], NULL);

So yes, that's going to be fun, mostly because rcu_assign_pointer()
doesn't actually do the right magic for this to be safe on their
platform(s).




pgp77GFJ3vRtq.pgp
Description: PGP signature

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Paul E. McKenney

On Tue, Jun 03, 2014 at 05:09:08PM +0200, Peter Zijlstra wrote:
 On Tue, Jun 03, 2014 at 07:07:27AM -0700, Paul E. McKenney wrote:
  On Tue, Jun 03, 2014 at 09:36:13AM +0200, Peter Zijlstra wrote:
   
   #ifdef __CHECKER__
   #define __atomic  __attribute__((address_space(5)))
   #else
   #define __atomic
   #endif
   
   #define store(p, v)   (*(p) = (typeof(*(p)) __force __atomic)(v))
   #define load(p)   ((typeof(*p) __force)ACCESS_ONCE(*(p)))
   
   Along with changes to xchg() and cmpxchg() that require them to take
   pointers to __atomic.
   
   That way we keep the flexibility of xchg() and cmpxchg() for being
   (mostly) type and size invariant, and get sparse to find wrong usage.
   
   Then parisc, sparc32, tile32, metag-lock1 and arc-!llsc can go implement
   store() however they like.
  
  Should be fun interacting with atomic operations on __rcu variables
  (address space 4).  Of course, that is already fun...
  
 
 Hmm, good point, I suppose sparse doesn't like two different
 address_space annotations on the same variable ?
 
 /me adds Christpoher Li to the CC list.
 
 ISTR Mikulas actually listing one such, me digs in recent email..
 
  $ grep -w fdt-fd */*.c
  fs/file.c:  free_fdmem(fdt-fd);
  fs/file.c:  fdt-fd = data;
  fs/file.c:  free_fdmem(fdt-fd);
  fs/file.c:  struct file * file = 
  xchg(fdt-fd[i], NULL);
 
 So yes, that's going to be fun, mostly because rcu_assign_pointer()
 doesn't actually do the right magic for this to be safe on their
 platform(s).

Maybe at some point sparse needs to keep a bit mask for the address
spaces, so that you caould say somthing like:

struct foo __atomic __rcu *p;

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-03 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 03:55:57PM -0700, Linus Torvalds wrote:
 On Mon, Jun 2, 2014 at 3:08 PM, Paul E. McKenney
 paul...@linux.vnet.ibm.com wrote:
 
  rcu: Eliminate read-modify-write ACCESS_ONCE() calls
 
  preempt_disable();
  -   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-c[idx]) += 1;
  +   lp = this_cpu_ptr(sp-per_cpu_ref-c[idx]);
  +   ACCESS_ONCE(*lp) = *lp + 1;
  smp_mb(); /* B */  /* Avoid leaking the critical section. */
  -   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-seq[idx]) += 1;
  +   lp = this_cpu_ptr(sp-per_cpu_ref-seq[idx]);
  +   ACCESS_ONCE(*lp) = *lp + 1;
  preempt_enable();
  return idx;
 
 What Eric said. This should just use this_cpu_inc() instead.
 Particularly with the smp_mb() and the preempt_enable(), there's no
 way that could/should leak, and the ACCESS_ONCE() seems pointless and
 ugly.
 
 And the good news is, gcc _will_ generate good code for that.

And here is the update, which passes light rcutorture testing.

Thanx, Paul



rcu: Eliminate read-modify-write ACCESS_ONCE() calls

RCU contains code of the following forms:

ACCESS_ONCE(x)++;
ACCESS_ONCE(x) += y;
ACCESS_ONCE(x) -= y;

Now these constructs do operate correctly, but they really result in a
pair of volatile accesses, one to do the load and another to do the store.
This can be confusing, as the casual reader might well assume that (for
example) gcc might generate a memory-to-memory add instruction for each
of these three cases.  In fact, gcc will do no such thing.  Also, there
is a good chance that the kernel will move to separate load and store
variants of ACCESS_ONCE(), and constructs like the above could easily
confuse both people and scripts attempting to make that sort of change.
Finally, most of RCU's read-modify-write uses of ACCESS_ONCE() really
only need the store to be volatile, so that the read-modify-write form
might be misleading.

This commit therefore changes the above forms in RCU so that each instance
of ACCESS_ONCE() either does a load or a store, but not both.  In a few
cases, ACCESS_ONCE() was not critical, for example, for maintaining
statisitics.  In these cases, ACCESS_ONCE() has been dispensed with
entirely.

Suggested-by: Linus Torvalds torva...@linux-foundation.org
Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com

diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index c639556f3fa0..e037f3eb2f7b 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -298,9 +298,9 @@ int __srcu_read_lock(struct srcu_struct *sp)
 
idx = ACCESS_ONCE(sp-completed)  0x1;
preempt_disable();
-   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-c[idx]) += 1;
+   __this_cpu_inc(sp-per_cpu_ref-c[idx]);
smp_mb(); /* B */  /* Avoid leaking the critical section. */
-   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-seq[idx]) += 1;
+   __this_cpu_inc(sp-per_cpu_ref-seq[idx]);
preempt_enable();
return idx;
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d1c8e4a85b92..f0ed867070cd 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2275,7 +2275,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct 
rcu_data *rdp)
}
smp_mb(); /* List handling before counting for rcu_barrier(). */
rdp-qlen_lazy -= count_lazy;
-   ACCESS_ONCE(rdp-qlen) -= count;
+   ACCESS_ONCE(rdp-qlen) = rdp-qlen - count;
rdp-n_cbs_invoked += count;
 
/* Reinstate batch limit if we have worked down the excess. */
@@ -2420,7 +2420,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
if (rnp_old != NULL)
raw_spin_unlock(rnp_old-fqslock);
if (ret) {
-   ACCESS_ONCE(rsp-n_force_qs_lh)++;
+   rsp-n_force_qs_lh++;
return;
}
rnp_old = rnp;
@@ -2432,7 +2432,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
smp_mb__after_unlock_lock();
raw_spin_unlock(rnp_old-fqslock);
if (ACCESS_ONCE(rsp-gp_flags)  RCU_GP_FLAG_FQS) {
-   ACCESS_ONCE(rsp-n_force_qs_lh)++;
+   rsp-n_force_qs_lh++;
raw_spin_unlock_irqrestore(rnp_old-lock, flags);
return;  /* Someone beat us to it. */
}
@@ -2621,7 +2621,7 @@ __call_rcu(struct rcu_head *head, void (*func)(struct 
rcu_head *rcu),
local_irq_restore(flags);
return;
}
-   ACCESS_ONCE(rdp-qlen)++;
+   ACCESS_ONCE(rdp-qlen) = rdp-qlen + 1;
if (lazy)
rdp-qlen_lazy++;
else
@@ -3185,7 +3185,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 * ACCESS_ONCE() to prevent the compiler from speculating
 * the increment to precede the early-exit check.

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 04:53:44PM -0700, Eric Dumazet wrote:
> On Mon, 2014-06-02 at 16:17 -0700, Paul E. McKenney wrote:
> 
> > But given that I already have preemption disabled and given that
> > __srcu_read_lock() is not to be used by irq handlers, I should be able to
> > use __this_cpu_inc(), correct?  Just to avoid unnecessary irq disabling
> > on non-x86 platforms...
> 
> Absolutely, __this_cpu_inc() is OK here.

Cool, giving it a test...

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Eric Dumazet

On Mon, 2014-06-02 at 16:17 -0700, Paul E. McKenney wrote:

> But given that I already have preemption disabled and given that
> __srcu_read_lock() is not to be used by irq handlers, I should be able to
> use __this_cpu_inc(), correct?  Just to avoid unnecessary irq disabling
> on non-x86 platforms...

Absolutely, __this_cpu_inc() is OK here.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 03:44:48PM -0700, Eric Dumazet wrote:
> On Mon, 2014-06-02 at 15:08 -0700, Paul E. McKenney wrote:
> 
> > diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
> > index c639556f3fa0..c0120279dead 100644
> > --- a/kernel/rcu/srcu.c
> > +++ b/kernel/rcu/srcu.c
> > @@ -295,12 +295,15 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
> >  int __srcu_read_lock(struct srcu_struct *sp)
> >  {
> > int idx;
> > +   unsigned long *lp;
> >  
> > idx = ACCESS_ONCE(sp->completed) & 0x1;
> > preempt_disable();
> > -   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 1;
> > +   lp = this_cpu_ptr(>per_cpu_ref->c[idx]);
> > +   ACCESS_ONCE(*lp) = *lp + 1;
> > smp_mb(); /* B */  /* Avoid leaking the critical section. */
> > -   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->seq[idx]) += 1;
> > +   lp = this_cpu_ptr(>per_cpu_ref->seq[idx]);
> > +   ACCESS_ONCE(*lp) = *lp + 1;
> > preempt_enable();
> > return idx;
> >  
> 
> This probably could use the following 
> 
> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
> index c639556f3fa0..3a97eb6f9076 100644
> --- a/kernel/rcu/srcu.c
> +++ b/kernel/rcu/srcu.c
> @@ -298,9 +298,9 @@ int __srcu_read_lock(struct srcu_struct *sp)
> 
>   idx = ACCESS_ONCE(sp->completed) & 0x1;
>   preempt_disable();
> - ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 1;
> + this_cpu_inc(sp->per_cpu_ref->c[idx]);
>   smp_mb(); /* B */  /* Avoid leaking the critical section. */
> - ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->seq[idx]) += 1;
> + this_cpu_inc(sp->per_cpu_ref->seq[idx]);
>   preempt_enable();
>   return idx;
>  }

Good point!

But given that I already have preemption disabled and given that
__srcu_read_lock() is not to be used by irq handlers, I should be able to
use __this_cpu_inc(), correct?  Just to avoid unnecessary irq disabling
on non-x86 platforms...

Seems to pass a quick build, so trying a bit heavier testing.

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Linus Torvalds

On Mon, Jun 2, 2014 at 3:08 PM, Paul E. McKenney
 wrote:
>
> rcu: Eliminate read-modify-write ACCESS_ONCE() calls
>
> preempt_disable();
> -   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 1;
> +   lp = this_cpu_ptr(>per_cpu_ref->c[idx]);
> +   ACCESS_ONCE(*lp) = *lp + 1;
> smp_mb(); /* B */  /* Avoid leaking the critical section. */
> -   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->seq[idx]) += 1;
> +   lp = this_cpu_ptr(>per_cpu_ref->seq[idx]);
> +   ACCESS_ONCE(*lp) = *lp + 1;
> preempt_enable();
> return idx;

What Eric said. This should just use "this_cpu_inc()" instead.
Particularly with the smp_mb() and the preempt_enable(), there's no
way that could/should leak, and the ACCESS_ONCE() seems pointless and
ugly.

And the good news is, gcc _will_ generate good code for that.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Eric Dumazet

On Mon, 2014-06-02 at 15:08 -0700, Paul E. McKenney wrote:

> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
> index c639556f3fa0..c0120279dead 100644
> --- a/kernel/rcu/srcu.c
> +++ b/kernel/rcu/srcu.c
> @@ -295,12 +295,15 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
>  int __srcu_read_lock(struct srcu_struct *sp)
>  {
>   int idx;
> + unsigned long *lp;
>  
>   idx = ACCESS_ONCE(sp->completed) & 0x1;
>   preempt_disable();
> - ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 1;
> + lp = this_cpu_ptr(>per_cpu_ref->c[idx]);
> + ACCESS_ONCE(*lp) = *lp + 1;
>   smp_mb(); /* B */  /* Avoid leaking the critical section. */
> - ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->seq[idx]) += 1;
> + lp = this_cpu_ptr(>per_cpu_ref->seq[idx]);
> + ACCESS_ONCE(*lp) = *lp + 1;
>   preempt_enable();
>   return idx;
>  

This probably could use the following 

diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index c639556f3fa0..3a97eb6f9076 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -298,9 +298,9 @@ int __srcu_read_lock(struct srcu_struct *sp)
 
idx = ACCESS_ONCE(sp->completed) & 0x1;
preempt_disable();
-   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 1;
+   this_cpu_inc(sp->per_cpu_ref->c[idx]);
smp_mb(); /* B */  /* Avoid leaking the critical section. */
-   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->seq[idx]) += 1;
+   this_cpu_inc(sp->per_cpu_ref->seq[idx]);
preempt_enable();
return idx;
 }


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 02:12:30PM -0700, Linus Torvalds wrote:
> On Mon, Jun 2, 2014 at 2:02 PM, Paul E. McKenney
>  wrote:
> >
> > In the ->qlen case, interrupts are disabled and the current CPU is
> > the only one who can write, so the read need not be volatile.  In the
> > ->n_barrier_done, modifications are done holding ->barrier_mutex, so again
> > the read need not be volatile.  In the sync_rcu_preempt_exp_count case,
> > modifications are done holding sync_rcu_preempt_exp_mutex, so once again,
> > the read need not be volatile.  So I could do something like:
> >
> > ACCESS_ONCE(rdp->qlen) = rdp->qlen + 1;
> >
> > But that still makes gcc generate bad code
> >
> > The reason I was not all that worried about this is that these are not
> > in fastpaths, and the last two are especially not in fastpaths.
> >
> > Suggestions?
> 
> So I think it probably *works*, but even so splitting it up to use
> ACCESS_ONCE() on just the write is probably a better option, if only
> because it would then make it much easier to change if we do end up
> splitting reads and writes.
> 
> Because from a gcc code generation standpoint, using "volatile" will
> always be horrible, because gcc will never be able to turn it into a
> read-modify-write cycle. Arguable gcc _should_ be able to do that (it
> is certainly allowable within the virtual machine definition), but I
> understand why it doesn't ("volatile? Let's not optimize anything at
> all, because it's special").
> 
> So "ACCESS_ONCE() + R-M-W" operation is actually pretty much
> guaranteed to be "ACCESS_TWICE()", which may well be ok (performance
> may not matter, and even when it does most architectures don't
> actually have r-m-w instructions and when they do they aren't always
> even faster), but I think it's just horribly horribly bad from a
> conceptual and readability standpoint because it's so misleading.
> 
> So I'd actually rather see two explicit ACCESS_ONCE() calls - once to
> read, once to write. Because that at least describes what is
> happening, unlike the current situation.
> 
> Put another way: I can understand why you do it, and I can even agree
> that it is "correct" from a functionality standpoint. But even despite
> that all, I really don't like the construct very much..

OK, I have queued the following commit for 3.17.  Is this what you had
in mind?

Thanx, Paul



rcu: Eliminate read-modify-write ACCESS_ONCE() calls

RCU contains code of the following forms:

ACCESS_ONCE(x)++;
ACCESS_ONCE(x) += y;
ACCESS_ONCE(x) -= y;

Now these constructs do operate correctly, but they really result in a
pair of volatile accesses, one to do the load and another to do the store.
This can be confusing, as the casual reader might well assume that (for
example) gcc might generate a memory-to-memory add instruction for each
of these three cases.  In fact, gcc will do no such thing.  Also, there
is a good chance that the kernel will move to separate load and store
variants of ACCESS_ONCE(), and constructs like the above could easily
confuse both people and scripts attempting to make that sort of change.
Finally, most of RCU's read-modify-write uses of ACCESS_ONCE() really
only need the store to be volatile, so that the read-modify-write form
might be misleading.

This commit therefore changes the above forms in RCU so that each instance
of ACCESS_ONCE() either does a load or a store, but not both.  In a few
cases, ACCESS_ONCE() was not critical, for example, for maintaining 
statisitics.  In these cases, ACCESS_ONCE() has been dispensed with
entirely

Suggested-by: Linus Torvalds 
Signed-off-by: Paul E. McKenney 

diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index c639556f3fa0..c0120279dead 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -295,12 +295,15 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
 int __srcu_read_lock(struct srcu_struct *sp)
 {
int idx;
+   unsigned long *lp;
 
idx = ACCESS_ONCE(sp->completed) & 0x1;
preempt_disable();
-   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 1;
+   lp = this_cpu_ptr(>per_cpu_ref->c[idx]);
+   ACCESS_ONCE(*lp) = *lp + 1;
smp_mb(); /* B */  /* Avoid leaking the critical section. */
-   ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->seq[idx]) += 1;
+   lp = this_cpu_ptr(>per_cpu_ref->seq[idx]);
+   ACCESS_ONCE(*lp) = *lp + 1;
preempt_enable();
return idx;
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d1c8e4a85b92..f0ed867070cd 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2275,7 +2275,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct 
rcu_data *rdp)
}
smp_mb(); /* List handling before counting for rcu_barrier(). */
rdp->qlen_lazy -= count_lazy;
-   ACCESS_ONCE(rdp->qlen) -= count;
+   ACCESS_ONCE(rdp->qlen) =

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Mikulas Patocka

On Mon, 2 Jun 2014, Linus Torvalds wrote:

> On Mon, Jun 2, 2014 at 1:46 PM, Mikulas Patocka  wrote:
> >
> > And what else do you want to do?
> >
> > Peter Zijlstra said "I've been using xchg() and cmpxchg() without such
> > consideration for quite a while." - so it basically implies that the
> > kernel is full of such races, mcs_spinlock is just the most visible one
> > that crashes the kernel first.
> 
> .. so your whole argument is bogus, because it doesn't actually fix
> anything else.
>
> Now, something that *would* fix something else is (for example) to
> just make "ACCESS_ONCE()" a rvalue so that you cannot use it for
> assignments, and then trying to sort out what happens then. It's
> possible that the "atomic_pointer_t" would be a part of the solution
> to that "what happens then", but THERE IS NO WAY IN HELL we're adding
> it for just one architecture and one use that doesn't warrant even
> _existing_ on that architecture.

The patch adds atomic_pointer_t for all architectures - it is in the 
common code and it is backed by atomic_long_t (that already exists for all 
architectures). There is no new arch-specific code at all.

When we have atomic_pointer_t, we can find the instances of xchg() and 
cmpxchg() and convert them to atomic_pointer_t (or to other atomic*_t 
types).

When we convert them all, we can drop xchg() and cmpxchg() at all (at 
least from architecture-neutral code).

The problem with xchg() and cmpxchg() is that they are very easy to 
misuse. Peter Zijlstra didn't know that they are not atomic w.r.t. normal 
stores, a lot of other people don't know it too - and if we allow these 
functions to be used, this race condition will reappear in the future 
again and again.

That's why I'm proposing atomic_pointer_t - it guarantees that this race 
condition can't be made.

> See what I'm saying?
> 
> You're not fixing the problem, you're fixing one unimportant detail
> that isn't worth fixing that way.
> 
>Linus

Regarding reworking ACCESS_ONCE() for reads and writes - the problem is - 
how do you make people use it? ACCESS_ONCE() is already missing at a lot 
of places (it doesn't cause any visible bug on the condition that the 
compiler doesn't split the load or store to multiple accesses), I can 
assume that people will omit ATOMIC_ONCE_STORE() too even if we make it.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Linus Torvalds

On Mon, Jun 2, 2014 at 2:02 PM, Paul E. McKenney
 wrote:
>
> In the ->qlen case, interrupts are disabled and the current CPU is
> the only one who can write, so the read need not be volatile.  In the
> ->n_barrier_done, modifications are done holding ->barrier_mutex, so again
> the read need not be volatile.  In the sync_rcu_preempt_exp_count case,
> modifications are done holding sync_rcu_preempt_exp_mutex, so once again,
> the read need not be volatile.  So I could do something like:
>
> ACCESS_ONCE(rdp->qlen) = rdp->qlen + 1;
>
> But that still makes gcc generate bad code
>
> The reason I was not all that worried about this is that these are not
> in fastpaths, and the last two are especially not in fastpaths.
>
> Suggestions?

So I think it probably *works*, but even so splitting it up to use
ACCESS_ONCE() on just the write is probably a better option, if only
because it would then make it much easier to change if we do end up
splitting reads and writes.

Because from a gcc code generation standpoint, using "volatile" will
always be horrible, because gcc will never be able to turn it into a
read-modify-write cycle. Arguable gcc _should_ be able to do that (it
is certainly allowable within the virtual machine definition), but I
understand why it doesn't ("volatile? Let's not optimize anything at
all, because it's special").

So "ACCESS_ONCE() + R-M-W" operation is actually pretty much
guaranteed to be "ACCESS_TWICE()", which may well be ok (performance
may not matter, and even when it does most architectures don't
actually have r-m-w instructions and when they do they aren't always
even faster), but I think it's just horribly horribly bad from a
conceptual and readability standpoint because it's so misleading.

So I'd actually rather see two explicit ACCESS_ONCE() calls - once to
read, once to write. Because that at least describes what is
happening, unlike the current situation.

Put another way: I can understand why you do it, and I can even agree
that it is "correct" from a functionality standpoint. But even despite
that all, I really don't like the construct very much..

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 01:22:10PM -0700, Linus Torvalds wrote:
> On Mon, Jun 2, 2014 at 1:05 PM, Peter Zijlstra  wrote:
> >
> > So the question is, do you prefer subtly broken code or hard compile
> > fails? Me, I go for the compile fail.
> 
> The thing is, parisc has a perfectly fine "cmpxchg" implementation in
> practice, and ACCESS_ONCE() and friends work fine too for reading.
> 
> What the "use a spinlock" approach cannot generally do is:
> 
>  - ACCESS_ONCE() to _write_ things doesn't work well. You really
> should use "atomic_set()".
> 
>  - you may not necessarily be able to mix partial updates (ie
> differently sized updates to the same thing) depending on just how the
> spinlock hashing works
> 
> but both of those are really rare issues and don't affect normal code.
> 
> I would not necessarily be opposed to splitting up ACCESS_ONCE() for
> reading and for writing, and maybe we could do something special for
> the writing path (which tends to be less ctitical). It's really mixing
> "ACCESS_ONCE(x)" to _set_ a value, together with atomic ops to update
> it, that ends up being problematic.

Knowing what I know now about how ACCESS_ONCE() is used, I would have
split it for reading and writing to begin with.  Where is that time
machine when you need it?  ;-)

> Maybe there are other issues I can't think of right now. But
> basically, parisc _can_ do cmpxchg, it's just that the code needs to
> be somewhat sanitized.
> 
> Side note: some of the RCU code uses "ACCESS_ONCE()" for
> read-modify-write code, which is just f*cking crazy. The semantics are
> dubious, and it generally makes gcc create bad code too.

A couple of the places are admittedly overkill, for example the pair of:

ACCESS_ONCE(rsp->n_force_qs_lh)++;

which is just for debug statistics in any case.  I could put these back
to "rsp->n_force_qs_lh++;" without problems, if desired.  (Yeah, I know,
I am overly paranoid.)

However, these cases need a bit more care:

ACCESS_ONCE(rdp->qlen)++;
ACCESS_ONCE(rsp->n_barrier_done)++;
ACCESS_ONCE(sync_rcu_preempt_exp_count)++;

In the ->qlen case, interrupts are disabled and the current CPU is
the only one who can write, so the read need not be volatile.  In the
->n_barrier_done, modifications are done holding ->barrier_mutex, so again
the read need not be volatile.  In the sync_rcu_preempt_exp_count case,
modifications are done holding sync_rcu_preempt_exp_mutex, so once again,
the read need not be volatile.  So I could do something like:

ACCESS_ONCE(rdp->qlen) = rdp->qlen + 1;

But that still makes gcc generate bad code.

The reason I was not all that worried about this is that these are not
in fastpaths, and the last two are especially not in fastpaths.

Suggestions?

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 10:09:35AM -0700, Linus Torvalds wrote:
> On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra  wrote:
> >
> > And I can't say I'm a particular fan of these ops either, as alternative
> > I'm almost inclined to just exclude parisc from using opt spinning.
> 
> Please do.
> 
> There is no way in hell that we should introduce a magic new
> atomic_pointer thing for parisc. And the idea somebody had to change
> ACCESS_ONCE() for PA-RISC (I'm not going to go back to find who to
> blame) is just horribly wrong too, since it's not even necessary for
> any normal use: the special "load-and-store-zero" instruction isn't
> actually used for "real" data, it's used only for the special
> spinlocks afaik, so doing it for all ACCESS_ONCE() users would be
> wrong even on PA-RISC. For any normal data, the usual "just load the
> value, making sure the compiler doesn't reload it" is perfectly fine -
> even on PA-RISC.

Guilty to charges as read on suggesting PA-RISC-specific ACCESS_ONCE().  ;-)

Thanx, Paul

> Now, if PA-RISC was a major architecture, we'd have to figure this
> out. But as it is, PA-RISC is just about the shittiest RISC ever
> invented (with original sparc being a strong contender), and let's
> face it, nobody really uses it.  It's a "fun project", but it is not
> something that we should use to mess up either ACCESS_ONCE() or the
> MCS locks.
> 
> Just make PA-RISC use its own locks, not any of the new fancy ones.
> 
>   Linus
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Linus Torvalds

On Mon, Jun 2, 2014 at 1:46 PM, Mikulas Patocka  wrote:
>
> And what else do you want to do?
>
> Peter Zijlstra said "I've been using xchg() and cmpxchg() without such
> consideration for quite a while." - so it basically implies that the
> kernel is full of such races, mcs_spinlock is just the most visible one
> that crashes the kernel first.

.. so your whole argument is bogus, because it doesn't actually fix
anything else.

Now, something that *would* fix something else is (for example) to
just make "ACCESS_ONCE()" a rvalue so that you cannot use it for
assignments, and then trying to sort out what happens then. It's
possible that the "atomic_pointer_t" would be a part of the solution
to that "what happens then", but THERE IS NO WAY IN HELL we're adding
it for just one architecture and one use that doesn't warrant even
_existing_ on that architecture.

See what I'm saying?

You're not fixing the problem, you're fixing one unimportant detail
that isn't worth fixing that way.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Mikulas Patocka

On Mon, 2 Jun 2014, Linus Torvalds wrote:

> On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra  wrote:
> >
> > And I can't say I'm a particular fan of these ops either, as alternative
> > I'm almost inclined to just exclude parisc from using opt spinning.
> 
> Please do.
> 
> There is no way in hell that we should introduce a magic new
> atomic_pointer thing for parisc. And the idea somebody had to change
> ACCESS_ONCE() for PA-RISC (I'm not going to go back to find who to
> blame) is just horribly wrong too, since it's not even necessary for
> any normal use: the special "load-and-store-zero" instruction isn't
> actually used for "real" data, it's used only for the special
> spinlocks afaik, so doing it for all ACCESS_ONCE() users would be
> wrong even on PA-RISC. For any normal data, the usual "just load the
> value, making sure the compiler doesn't reload it" is perfectly fine -
> even on PA-RISC.
> 
> Now, if PA-RISC was a major architecture, we'd have to figure this
> out. But as it is, PA-RISC is just about the shittiest RISC ever
> invented (with original sparc being a strong contender), and let's
> face it, nobody really uses it.  It's a "fun project", but it is not
> something that we should use to mess up either ACCESS_ONCE() or the
> MCS locks.
> 
> Just make PA-RISC use its own locks, not any of the new fancy ones.
> 
>   Linus

And what else do you want to do?

Peter Zijlstra said "I've been using xchg() and cmpxchg() without such 
consideration for quite a while." - so it basically implies that the 
kernel is full of such races, mcs_spinlock is just the most visible one 
that crashes the kernel first.

It's not only parisc - tile32, arc, metag (maybe hexagon) are broken too, 
because they don't have cmpxchg in hardware.

We have atomic_t, atomic64_t, atomic_long_t that can be sanely used even 
on architectures without hardware cmpxchg - so I ask - why can't we have 
atomic_pointer_t with the same semantics? (pointer type conversion issues 
can be solved, as it is done in the PATCH v2)

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread James Bottomley

On Mon, 2014-06-02 at 22:05 +0200, Peter Zijlstra wrote:
> On Mon, Jun 02, 2014 at 01:33:34PM -0400, Waiman Long wrote:
> > On 06/02/2014 12:30 PM, Peter Zijlstra wrote:
> > >On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
> > >>I'm almost inclined to just exclude parisc from using opt spinning.
> > >>
> > >>That said, this patch still doesn't address the far more interesting
> > >>problem of actually finding these issues for these few weird archs.
> > >So why do these archs provide xchg() and cmpxchg() at all? Wouldn't it
> > >be much simpler if archs that cannot sanely do this, not provide these
> > >primitives at all?
> > 
> > I believe xchg() and cmpxchg() are used in quite a number of places within
> > the generic kernel code. So kernel compilation will fail if those APIs
> > aren't provided by an architecture.
> 
> Yep.. so this is going to be painful for a while. But given their
> (parisc, sparc32, metag-lock1) constraints, who knows how many of those
> uses are actually broken.
> 
> So the question is, do you prefer subtly broken code or hard compile
> fails? Me, I go for the compile fail.

The failure is only when a variable that will have an atomic exchange
done on it is updated by a simple operation.  To do this properly, we'd
probably need an update macro we could supply the locking to, and a way
of marking the variable to get the compiler to cause a build error if it
was ever updated improperly, but that's starting to look very similar to
Mikulas' proposal.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Linus Torvalds

On Mon, Jun 2, 2014 at 1:05 PM, Peter Zijlstra  wrote:
>
> So the question is, do you prefer subtly broken code or hard compile
> fails? Me, I go for the compile fail.

The thing is, parisc has a perfectly fine "cmpxchg" implementation in
practice, and ACCESS_ONCE() and friends work fine too for reading.

What the "use a spinlock" approach cannot generally do is:

 - ACCESS_ONCE() to _write_ things doesn't work well. You really
should use "atomic_set()".

 - you may not necessarily be able to mix partial updates (ie
differently sized updates to the same thing) depending on just how the
spinlock hashing works

but both of those are really rare issues and don't affect normal code.

I would not necessarily be opposed to splitting up ACCESS_ONCE() for
reading and for writing, and maybe we could do something special for
the writing path (which tends to be less ctitical). It's really mixing
"ACCESS_ONCE(x)" to _set_ a value, together with atomic ops to update
it, that ends up being problematic.

Maybe there are other issues I can't think of right now. But
basically, parisc _can_ do cmpxchg, it's just that the code needs to
be somewhat sanitized.

Side note: some of the RCU code uses "ACCESS_ONCE()" for
read-modify-write code, which is just f*cking crazy. The semantics are
dubious, and it generally makes gcc create bad code too.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 01:33:34PM -0400, Waiman Long wrote:
> On 06/02/2014 12:30 PM, Peter Zijlstra wrote:
> >On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
> >>I'm almost inclined to just exclude parisc from using opt spinning.
> >>
> >>That said, this patch still doesn't address the far more interesting
> >>problem of actually finding these issues for these few weird archs.
> >So why do these archs provide xchg() and cmpxchg() at all? Wouldn't it
> >be much simpler if archs that cannot sanely do this, not provide these
> >primitives at all?
> 
> I believe xchg() and cmpxchg() are used in quite a number of places within
> the generic kernel code. So kernel compilation will fail if those APIs
> aren't provided by an architecture.

Yep.. so this is going to be painful for a while. But given their
(parisc, sparc32, metag-lock1) constraints, who knows how many of those
uses are actually broken.

So the question is, do you prefer subtly broken code or hard compile
fails? Me, I go for the compile fail.

In any case, this all goes towards what hpa said, what are the minimal
requirements we have for running Linux.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Waiman Long


On 06/02/2014 01:12 PM, Davidlohr Bueso wrote:

On Mon, 2014-06-02 at 10:09 -0700, Linus Torvalds wrote:

On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra  wrote:

And I can't say I'm a particular fan of these ops either, as alternative
I'm almost inclined to just exclude parisc from using opt spinning.

Please do.

I agree, this is the best way out of this mess. Furthermore, it would
also be nice to consolidate opt spinning in a common CONFIG option --
right now mutexes and rwsems create their own dependencies.



I would suggest adding a RWSEM_SPIN_ON_OWNER to control opt spinning in 
rwsem. Currently MUTEX_SPIN_ON_OWNER is doing that for mutex, and it is 
disabled when mutex debugging is turned on. So I think it is better to 
allow them to be disabled separately.


-Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread H. Peter Anvin

On 06/02/2014 10:25 AM, Waiman Long wrote:
> 
> Doing an xchg is a very expensive operation compared with ACCESS_ONCE. I
> will not suggest doing that to make it right for PA-RISC at the expense
> of performance in other architectures.
> 

And of course, this gets into the toxic question: what are reasonable
minimum requirements for Linux?  How far do we need to stretch to
support niche architectures which have very small (Linux) userbases?

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Waiman Long


On 06/02/2014 12:30 PM, Peter Zijlstra wrote:

On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:

I'm almost inclined to just exclude parisc from using opt spinning.

That said, this patch still doesn't address the far more interesting
problem of actually finding these issues for these few weird archs.

So why do these archs provide xchg() and cmpxchg() at all? Wouldn't it
be much simpler if archs that cannot sanely do this, not provide these
primitives at all?


I believe xchg() and cmpxchg() are used in quite a number of places 
within the generic kernel code. So kernel compilation will fail if those 
APIs aren't provided by an architecture.


-Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Waiman Long


On 06/02/2014 12:43 PM, Paul E. McKenney wrote:

On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:

On Mon, Jun 02, 2014 at 12:00:45PM -0400, Mikulas Patocka wrote:

  struct optimistic_spin_queue {
-   struct optimistic_spin_queue *next, *prev;
+   atomic_pointer(struct optimistic_spin_queue *) next;
+   struct optimistic_spin_queue *prev;
int locked; /* 1 if lock acquired */
  };

Index: linux-3.15-rc8/include/asm-generic/atomic-long.h
===
--- linux-3.15-rc8.orig/include/asm-generic/atomic-long.h   2014-06-02 
17:11:17.0 +0200
+++ linux-3.15-rc8/include/asm-generic/atomic-long.h2014-06-02 
17:11:50.0 +0200
@@ -255,4 +255,31 @@ static inline long atomic_long_add_unles

  #endif  /*  BITS_PER_LONG == 64  */

+#define atomic_pointer(type)   \
+union {
\
+   atomic_long_t __a;  \
+   type __t;   \
+   char __check_sizeof[sizeof(type) == sizeof(long) ? 1 : -1]; \
+}

That's still entirely disgusting, and afaict entirely redundant. You can
do that test in the operators below just fine.


+#define ATOMIC_POINTER_INIT(i) { .__t = (i) }
+
+#define atomic_pointer_read(v) ((typeof((v)->__t))atomic_long_read(&(v)->__a))
+
+#define atomic_pointer_set(v, i)   ({  \
+   typeof((v)->__t) __i = (i);  \
+   atomic_long_set(&(v)->__a, (long)(__i)); \
+})
+
+#define atomic_pointer_xchg(v, i)  ({  \
+   typeof((v)->__t) __i = (i);  \
+   (typeof((v)->__t))atomic_long_xchg(&(v)->__a, (long)(__i));   \
+})
+
+#define atomic_pointer_cmpxchg(v, old, new)({  \
+   typeof((v)->__t) __old = (old);  \
+   typeof((v)->__t) __new = (new);  \
+   (typeof((v)->__t))atomic_long_cmpxchg(&(v)->__a, (long)(__old), 
(long)(__new));\
+})

And I can't say I'm a particular fan of these ops either, as alternative
I'm almost inclined to just exclude parisc from using opt spinning.

That is an excellent point for this particular issue.  Do parisc systems
really support enough CPUs to make queued spinlocks worthwhile?  If not,
maybe we should just have parisc stick with traditional spinlocks.


The operation in question is the optimistic spinning code of mutex which 
is currently active, I think, for all architectures. It is not related 
to the queued spinlock, though it will have the same problem.


Yes, by disabling the MUTEX_SPIN_ON_OWNER config variable from PA-RISC, 
we can disable optimistic spinning.


-Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Waiman Long


On 06/02/2014 12:50 PM, Jason Low wrote:

On Mon, 2014-06-02 at 12:00 -0400, Mikulas Patocka wrote:

If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
so, in this case, cmpxchg or xchg isn't really atomic at all.

So if the problem is using ACCESS_ONCE writes with cmpxchg and xchg at
the same time, would the below change address this problem?

-
diff --git a/kernel/locking/mcs_spinlock.c b/kernel/locking/mcs_spinlock.c
index 838dc9e..8396721 100644
--- a/kernel/locking/mcs_spinlock.c
+++ b/kernel/locking/mcs_spinlock.c
@@ -71,7 +71,7 @@ bool osq_lock(struct optimistic_spin_queue **lock)
if (likely(prev == NULL))
return true;

-   ACCESS_ONCE(prev->next) = node;
+   xchg(>next, node);

/*
 * Normally @prev is untouchable after the above store; because at that
@@ -144,7 +144,7 @@ unqueue:
 */

ACCESS_ONCE(next->prev) = prev;
-   ACCESS_ONCE(prev->next) = next;
+   xchg(>next, next);

return false;
  }




Doing an xchg is a very expensive operation compared with ACCESS_ONCE. I 
will not suggest doing that to make it right for PA-RISC at the expense 
of performance in other architectures.


-Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread James Bottomley

On Mon, 2014-06-02 at 09:43 -0700, Paul E. McKenney wrote:
> On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 02, 2014 at 12:00:45PM -0400, Mikulas Patocka wrote:
> > >  struct optimistic_spin_queue {
> > > - struct optimistic_spin_queue *next, *prev;
> > > + atomic_pointer(struct optimistic_spin_queue *) next;
> > > + struct optimistic_spin_queue *prev;
> > >   int locked; /* 1 if lock acquired */
> > >  };
> > >  
> > > Index: linux-3.15-rc8/include/asm-generic/atomic-long.h
> > > ===
> > > --- linux-3.15-rc8.orig/include/asm-generic/atomic-long.h 2014-06-02 
> > > 17:11:17.0 +0200
> > > +++ linux-3.15-rc8/include/asm-generic/atomic-long.h  2014-06-02 
> > > 17:11:50.0 +0200
> > > @@ -255,4 +255,31 @@ static inline long atomic_long_add_unles
> > >  
> > >  #endif  /*  BITS_PER_LONG == 64  */
> > >  
> > > +#define atomic_pointer(type) 
> > > \
> > > +union {  
> > > \
> > > + atomic_long_t __a;  \
> > > + type __t;   \
> > > + char __check_sizeof[sizeof(type) == sizeof(long) ? 1 : -1]; \
> > > +}
> > 
> > That's still entirely disgusting, and afaict entirely redundant. You can
> > do that test in the operators below just fine.
> > 
> > > +#define ATOMIC_POINTER_INIT(i)   { .__t = (i) }
> > > +
> > > +#define atomic_pointer_read(v)   
> > > ((typeof((v)->__t))atomic_long_read(&(v)->__a))
> > > +
> > > +#define atomic_pointer_set(v, i) ({  \
> > > + typeof((v)->__t) __i = (i); \
> > > + atomic_long_set(&(v)->__a, (long)(__i));\
> > > +})
> > > +
> > > +#define atomic_pointer_xchg(v, i)({  
> > > \
> > > + typeof((v)->__t) __i = (i); \
> > > + (typeof((v)->__t))atomic_long_xchg(&(v)->__a, (long)(__i)); \
> > > +})
> > > +
> > > +#define atomic_pointer_cmpxchg(v, old, new)  ({  
> > > \
> > > + typeof((v)->__t) __old = (old); \
> > > + typeof((v)->__t) __new = (new); \
> > > + (typeof((v)->__t))atomic_long_cmpxchg(&(v)->__a, (long)(__old), 
> > > (long)(__new));\
> > > +})
> > 
> > And I can't say I'm a particular fan of these ops either, as alternative
> > I'm almost inclined to just exclude parisc from using opt spinning.
> 
> That is an excellent point for this particular issue.  Do parisc systems
> really support enough CPUs to make queued spinlocks worthwhile?  If not,
> maybe we should just have parisc stick with traditional spinlocks.

Yes and No.  No for Linux because the only hyper CPU system is the
superdome, which we've never managed to boot linux on (it has some
complexities in the Bus architecture) and we're not likely to try
because the installations tend to cost north of US$1m.  For the Server
systems we do have a few high CPU count ones, but we lost access to them
when HP dismantled the parisc linux lab.  Currently the standard is
about 4 cpus.

I think just not using queued spinlocks is fine for us.  Should anyone
ever try the large CPU systems, we can revisit.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Davidlohr Bueso

On Mon, 2014-06-02 at 10:09 -0700, Linus Torvalds wrote:
> On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra  wrote:
> >
> > And I can't say I'm a particular fan of these ops either, as alternative
> > I'm almost inclined to just exclude parisc from using opt spinning.
> 
> Please do.

I agree, this is the best way out of this mess. Furthermore, it would
also be nice to consolidate opt spinning in a common CONFIG option --
right now mutexes and rwsems create their own dependencies.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Linus Torvalds

On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra  wrote:
>
> And I can't say I'm a particular fan of these ops either, as alternative
> I'm almost inclined to just exclude parisc from using opt spinning.

Please do.

There is no way in hell that we should introduce a magic new
atomic_pointer thing for parisc. And the idea somebody had to change
ACCESS_ONCE() for PA-RISC (I'm not going to go back to find who to
blame) is just horribly wrong too, since it's not even necessary for
any normal use: the special "load-and-store-zero" instruction isn't
actually used for "real" data, it's used only for the special
spinlocks afaik, so doing it for all ACCESS_ONCE() users would be
wrong even on PA-RISC. For any normal data, the usual "just load the
value, making sure the compiler doesn't reload it" is perfectly fine -
even on PA-RISC.

Now, if PA-RISC was a major architecture, we'd have to figure this
out. But as it is, PA-RISC is just about the shittiest RISC ever
invented (with original sparc being a strong contender), and let's
face it, nobody really uses it.  It's a "fun project", but it is not
something that we should use to mess up either ACCESS_ONCE() or the
MCS locks.

Just make PA-RISC use its own locks, not any of the new fancy ones.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 09:50:10AM -0700, Jason Low wrote:
> On Mon, 2014-06-02 at 12:00 -0400, Mikulas Patocka wrote:
> > If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
> > the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
> > so, in this case, cmpxchg or xchg isn't really atomic at all.
> 
> So if the problem is using ACCESS_ONCE writes with cmpxchg and xchg at
> the same time, would the below change address this problem?

And one could use cmpxchg() or atomic_add_return(..., 0) to read a value
out.  Probably at the cost of some performance impact, though.

Thanx, Paul

> -
> diff --git a/kernel/locking/mcs_spinlock.c b/kernel/locking/mcs_spinlock.c
> index 838dc9e..8396721 100644
> --- a/kernel/locking/mcs_spinlock.c
> +++ b/kernel/locking/mcs_spinlock.c
> @@ -71,7 +71,7 @@ bool osq_lock(struct optimistic_spin_queue **lock)
>   if (likely(prev == NULL))
>   return true;
> 
> - ACCESS_ONCE(prev->next) = node;
> + xchg(>next, node);
> 
>   /*
>* Normally @prev is untouchable after the above store; because at that
> @@ -144,7 +144,7 @@ unqueue:
>*/
> 
>   ACCESS_ONCE(next->prev) = prev;
> - ACCESS_ONCE(prev->next) = next;
> + xchg(>next, next);
> 
>   return false;
>  }
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Jason Low

On Mon, 2014-06-02 at 12:00 -0400, Mikulas Patocka wrote:
> If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
> the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
> so, in this case, cmpxchg or xchg isn't really atomic at all.

So if the problem is using ACCESS_ONCE writes with cmpxchg and xchg at
the same time, would the below change address this problem?

-
diff --git a/kernel/locking/mcs_spinlock.c b/kernel/locking/mcs_spinlock.c
index 838dc9e..8396721 100644
--- a/kernel/locking/mcs_spinlock.c
+++ b/kernel/locking/mcs_spinlock.c
@@ -71,7 +71,7 @@ bool osq_lock(struct optimistic_spin_queue **lock)
if (likely(prev == NULL))
return true;
 
-   ACCESS_ONCE(prev->next) = node;
+   xchg(>next, node);
 
/*
 * Normally @prev is untouchable after the above store; because at that
@@ -144,7 +144,7 @@ unqueue:
 */
 
ACCESS_ONCE(next->prev) = prev;
-   ACCESS_ONCE(prev->next) = next;
+   xchg(>next, next);
 
return false;
 }


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 06:30:32PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
> > I'm almost inclined to just exclude parisc from using opt spinning.
> > 
> > That said, this patch still doesn't address the far more interesting
> > problem of actually finding these issues for these few weird archs.
> 
> So why do these archs provide xchg() and cmpxchg() at all? Wouldn't it
> be much simpler if archs that cannot sanely do this, not provide these
> primitives at all?

Such architectures would also need to avoid NO_HZ_FULL_SYSIDLE and
RCU_NOCB_CPU, but those are probably entirely reasonable restrictions.

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 02, 2014 at 12:00:45PM -0400, Mikulas Patocka wrote:
> >  struct optimistic_spin_queue {
> > -   struct optimistic_spin_queue *next, *prev;
> > +   atomic_pointer(struct optimistic_spin_queue *) next;
> > +   struct optimistic_spin_queue *prev;
> > int locked; /* 1 if lock acquired */
> >  };
> >  
> > Index: linux-3.15-rc8/include/asm-generic/atomic-long.h
> > ===
> > --- linux-3.15-rc8.orig/include/asm-generic/atomic-long.h   2014-06-02 
> > 17:11:17.0 +0200
> > +++ linux-3.15-rc8/include/asm-generic/atomic-long.h2014-06-02 
> > 17:11:50.0 +0200
> > @@ -255,4 +255,31 @@ static inline long atomic_long_add_unles
> >  
> >  #endif  /*  BITS_PER_LONG == 64  */
> >  
> > +#define atomic_pointer(type)   
> > \
> > +union {
> > \
> > +   atomic_long_t __a;  \
> > +   type __t;   \
> > +   char __check_sizeof[sizeof(type) == sizeof(long) ? 1 : -1]; \
> > +}
> 
> That's still entirely disgusting, and afaict entirely redundant. You can
> do that test in the operators below just fine.
> 
> > +#define ATOMIC_POINTER_INIT(i) { .__t = (i) }
> > +
> > +#define atomic_pointer_read(v) 
> > ((typeof((v)->__t))atomic_long_read(&(v)->__a))
> > +
> > +#define atomic_pointer_set(v, i)   ({  \
> > +   typeof((v)->__t) __i = (i); \
> > +   atomic_long_set(&(v)->__a, (long)(__i));\
> > +})
> > +
> > +#define atomic_pointer_xchg(v, i)  ({  \
> > +   typeof((v)->__t) __i = (i); \
> > +   (typeof((v)->__t))atomic_long_xchg(&(v)->__a, (long)(__i)); \
> > +})
> > +
> > +#define atomic_pointer_cmpxchg(v, old, new)({  
> > \
> > +   typeof((v)->__t) __old = (old); \
> > +   typeof((v)->__t) __new = (new); \
> > +   (typeof((v)->__t))atomic_long_cmpxchg(&(v)->__a, (long)(__old), 
> > (long)(__new));\
> > +})
> 
> And I can't say I'm a particular fan of these ops either, as alternative
> I'm almost inclined to just exclude parisc from using opt spinning.

That is an excellent point for this particular issue.  Do parisc systems
really support enough CPUs to make queued spinlocks worthwhile?  If not,
maybe we should just have parisc stick with traditional spinlocks.

> That said, this patch still doesn't address the far more interesting
> problem of actually finding these issues for these few weird archs.

Indeed.  And finding other lower-probability failures due to other
atomic manipulations of pointers that are also accessed with normal
loads and stores.

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
> I'm almost inclined to just exclude parisc from using opt spinning.
> 
> That said, this patch still doesn't address the far more interesting
> problem of actually finding these issues for these few weird archs.

So why do these archs provide xchg() and cmpxchg() at all? Wouldn't it
be much simpler if archs that cannot sanely do this, not provide these
primitives at all?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 12:00:45PM -0400, Mikulas Patocka wrote:
>  struct optimistic_spin_queue {
> - struct optimistic_spin_queue *next, *prev;
> + atomic_pointer(struct optimistic_spin_queue *) next;
> + struct optimistic_spin_queue *prev;
>   int locked; /* 1 if lock acquired */
>  };
>  
> Index: linux-3.15-rc8/include/asm-generic/atomic-long.h
> ===
> --- linux-3.15-rc8.orig/include/asm-generic/atomic-long.h 2014-06-02 
> 17:11:17.0 +0200
> +++ linux-3.15-rc8/include/asm-generic/atomic-long.h  2014-06-02 
> 17:11:50.0 +0200
> @@ -255,4 +255,31 @@ static inline long atomic_long_add_unles
>  
>  #endif  /*  BITS_PER_LONG == 64  */
>  
> +#define atomic_pointer(type) \
> +union {  
> \
> + atomic_long_t __a;  \
> + type __t;   \
> + char __check_sizeof[sizeof(type) == sizeof(long) ? 1 : -1]; \
> +}

That's still entirely disgusting, and afaict entirely redundant. You can
do that test in the operators below just fine.

> +#define ATOMIC_POINTER_INIT(i)   { .__t = (i) }
> +
> +#define atomic_pointer_read(v)   
> ((typeof((v)->__t))atomic_long_read(&(v)->__a))
> +
> +#define atomic_pointer_set(v, i) ({  \
> + typeof((v)->__t) __i = (i); \
> + atomic_long_set(&(v)->__a, (long)(__i));\
> +})
> +
> +#define atomic_pointer_xchg(v, i)({  \
> + typeof((v)->__t) __i = (i); \
> + (typeof((v)->__t))atomic_long_xchg(&(v)->__a, (long)(__i)); \
> +})
> +
> +#define atomic_pointer_cmpxchg(v, old, new)  ({  \
> + typeof((v)->__t) __old = (old); \
> + typeof((v)->__t) __new = (new); \
> + (typeof((v)->__t))atomic_long_cmpxchg(&(v)->__a, (long)(__old), 
> (long)(__new));\
> +})

And I can't say I'm a particular fan of these ops either, as alternative
I'm almost inclined to just exclude parisc from using opt spinning.

That said, this patch still doesn't address the far more interesting
problem of actually finding these issues for these few weird archs.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Mikulas Patocka

The cancelable MCS spinlocks introduced in
fb0527bd5ea99bfeb2dd91e3c1433ecf745d6b99 break the kernel on PA-RISC.

How to reproduce:
* Use a machine with two dual-core PA-8900 processors.
* Run the LVM testsuite and compile the kernel in an endless loop at the
  same time.
* Wait for an hour or two and the kernel locks up.

You see some process locked up in osd_lock and osq_unlock:
INFO: rcu_sched self-detected stall on CPU { 2}  (t=18000 jiffies g=247335 
c=247334 q=101)
CPU: 2 PID: 21006 Comm: lvm Tainted: G   O  3.15.0-rc7 #9
Backtrace:
 [<4013e8a4>] show_stack+0x14/0x20
 [<403016f0>] dump_stack+0x88/0x100
 [<401b8738>] rcu_check_callbacks+0x4a8/0x900
 [<401714c4>] update_process_times+0x64/0xc0
 [<4013fa24>] timer_interrupt+0x19c/0x200
 [<401ad8d8>] handle_irq_event_percpu+0xa8/0x238
 [<401b2454>] handle_percpu_irq+0x9c/0xd0
 [<401acc40>] generic_handle_irq+0x40/0x50
 [<401408cc>] do_cpu_irq_mask+0x1ac/0x298
 [<4012c074>] intr_return+0x0/0xc
 [<401a609c>] osq_lock+0xc4/0x178
 [<40138d24>] __mutex_lock_slowpath+0x1cc/0x290
 [<40138e78>] mutex_lock+0x90/0x98
 [<402a5614>] kernfs_activate+0x6c/0x1a0
 [<402a59e0>] kernfs_add_one+0x140/0x190
 [<402a75ec>] __kernfs_create_file+0xa4/0xf8

INFO: rcu_sched self-detected stall on CPU { 3}  (t=18473 jiffies g=247335 
c=247334 q=101)
CPU: 3 PID: 21051 Comm: udevd Tainted: G   O  3.15.0-rc7 #9
Backtrace:
 [<4013e8a4>] show_stack+0x14/0x20
 [<403016f0>] dump_stack+0x88/0x100
 [<401b8738>] rcu_check_callbacks+0x4a8/0x900
 [<401714c4>] update_process_times+0x64/0xc0
 [<4013fa24>] timer_interrupt+0x19c/0x200
 [<401ad8d8>] handle_irq_event_percpu+0xa8/0x238
 [<401b2454>] handle_percpu_irq+0x9c/0xd0
 [<401acc40>] generic_handle_irq+0x40/0x50
 [<401408cc>] do_cpu_irq_mask+0x1ac/0x298
 [<4012c074>] intr_return+0x0/0xc
 [<401a6220>] osq_unlock+0xd0/0xf8
 [<40138dcc>] __mutex_lock_slowpath+0x274/0x290
 [<40138e78>] mutex_lock+0x90/0x98
 [<402a3a90>] kernfs_dop_revalidate+0x48/0x108
 [<40233310>] lookup_fast+0x320/0x348
 [<40234600>] link_path_walk+0x190/0x9d8


The code in kernel/locking/mcs_spinlock.c is broken.

PA-RISC doesn't have xchg or cmpxchg atomic instructions like other
processors. It only has ldcw and ldcd instructions that load a word (or
doubleword) from memory and atomically store zero at the same location.
These instructions can only be used to implement spinlocks, direct
implementation of other atomic operations is impossible.

Consequently, Linux xchg and cmpxchg functions are implemented in such a
way that they hash the address, use the hash to index a spinlock, take the
spinlock, perform the xchg or cmpxchg operation non-atomically and drop
the spinlock.

If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
so, in this case, cmpxchg or xchg isn't really atomic at all.

This patch fixes the bug by introducing a new type atomic_pointer and
replacing the offending pointer with it. atomic_pointer_set (calling
atomic_long_set) takes the hashed spinlock, so it avoids the race
condition. We perform some gcc-specific compiler tricks to warn on pointer
type mismatch.

Signed-off-by: Mikulas Patocka 

---
 include/asm-generic/atomic-long.h |   27 +++
 kernel/locking/mcs_spinlock.c |   16 
 kernel/locking/mcs_spinlock.h |4 +++-
 3 files changed, 38 insertions(+), 9 deletions(-)

Index: linux-3.15-rc8/kernel/locking/mcs_spinlock.c
===
--- linux-3.15-rc8.orig/kernel/locking/mcs_spinlock.c   2014-06-02 
17:11:16.0 +0200
+++ linux-3.15-rc8/kernel/locking/mcs_spinlock.c2014-06-02 
17:11:50.0 +0200
@@ -47,8 +47,8 @@ osq_wait_next(struct optimistic_spin_que
 * wait for either @lock to point to us, through its Step-B, or
 * wait for a new @node->next from its Step-C.
 */
-   if (node->next) {
-   next = xchg(>next, NULL);
+   if (atomic_pointer_read(>next)) {
+   next = atomic_pointer_xchg(>next, NULL);
if (next)
break;
}
@@ -65,13 +65,13 @@ bool osq_lock(struct optimistic_spin_que
struct optimistic_spin_queue *prev, *next;
 
node->locked = 0;
-   node->next = NULL;
+   atomic_pointer_set(>next, NULL);
 
node->prev = prev = xchg(lock, node);
if (likely(prev == NULL))
return true;
 
-   ACCESS_ONCE(prev->next) = node;
+   atomic_pointer_set(>next, node);
 
/*
 * Normally @prev is untouchable after the above store;

[PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Mikulas Patocka

The cancelable MCS spinlocks introduced in
fb0527bd5ea99bfeb2dd91e3c1433ecf745d6b99 break the kernel on PA-RISC.

How to reproduce:
* Use a machine with two dual-core PA-8900 processors.
* Run the LVM testsuite and compile the kernel in an endless loop at the
  same time.
* Wait for an hour or two and the kernel locks up.

You see some process locked up in osd_lock and osq_unlock:
INFO: rcu_sched self-detected stall on CPU { 2}  (t=18000 jiffies g=247335 
c=247334 q=101)
CPU: 2 PID: 21006 Comm: lvm Tainted: G   O  3.15.0-rc7 #9
Backtrace:
 [4013e8a4] show_stack+0x14/0x20
 [403016f0] dump_stack+0x88/0x100
 [401b8738] rcu_check_callbacks+0x4a8/0x900
 [401714c4] update_process_times+0x64/0xc0
 [4013fa24] timer_interrupt+0x19c/0x200
 [401ad8d8] handle_irq_event_percpu+0xa8/0x238
 [401b2454] handle_percpu_irq+0x9c/0xd0
 [401acc40] generic_handle_irq+0x40/0x50
 [401408cc] do_cpu_irq_mask+0x1ac/0x298
 [4012c074] intr_return+0x0/0xc
 [401a609c] osq_lock+0xc4/0x178
 [40138d24] __mutex_lock_slowpath+0x1cc/0x290
 [40138e78] mutex_lock+0x90/0x98
 [402a5614] kernfs_activate+0x6c/0x1a0
 [402a59e0] kernfs_add_one+0x140/0x190
 [402a75ec] __kernfs_create_file+0xa4/0xf8

INFO: rcu_sched self-detected stall on CPU { 3}  (t=18473 jiffies g=247335 
c=247334 q=101)
CPU: 3 PID: 21051 Comm: udevd Tainted: G   O  3.15.0-rc7 #9
Backtrace:
 [4013e8a4] show_stack+0x14/0x20
 [403016f0] dump_stack+0x88/0x100
 [401b8738] rcu_check_callbacks+0x4a8/0x900
 [401714c4] update_process_times+0x64/0xc0
 [4013fa24] timer_interrupt+0x19c/0x200
 [401ad8d8] handle_irq_event_percpu+0xa8/0x238
 [401b2454] handle_percpu_irq+0x9c/0xd0
 [401acc40] generic_handle_irq+0x40/0x50
 [401408cc] do_cpu_irq_mask+0x1ac/0x298
 [4012c074] intr_return+0x0/0xc
 [401a6220] osq_unlock+0xd0/0xf8
 [40138dcc] __mutex_lock_slowpath+0x274/0x290
 [40138e78] mutex_lock+0x90/0x98
 [402a3a90] kernfs_dop_revalidate+0x48/0x108
 [40233310] lookup_fast+0x320/0x348
 [40234600] link_path_walk+0x190/0x9d8


The code in kernel/locking/mcs_spinlock.c is broken.

PA-RISC doesn't have xchg or cmpxchg atomic instructions like other
processors. It only has ldcw and ldcd instructions that load a word (or
doubleword) from memory and atomically store zero at the same location.
These instructions can only be used to implement spinlocks, direct
implementation of other atomic operations is impossible.

Consequently, Linux xchg and cmpxchg functions are implemented in such a
way that they hash the address, use the hash to index a spinlock, take the
spinlock, perform the xchg or cmpxchg operation non-atomically and drop
the spinlock.

If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
so, in this case, cmpxchg or xchg isn't really atomic at all.

This patch fixes the bug by introducing a new type atomic_pointer and
replacing the offending pointer with it. atomic_pointer_set (calling
atomic_long_set) takes the hashed spinlock, so it avoids the race
condition. We perform some gcc-specific compiler tricks to warn on pointer
type mismatch.

Signed-off-by: Mikulas Patocka mpato...@redhat.com

---
 include/asm-generic/atomic-long.h |   27 +++
 kernel/locking/mcs_spinlock.c |   16 
 kernel/locking/mcs_spinlock.h |4 +++-
 3 files changed, 38 insertions(+), 9 deletions(-)

Index: linux-3.15-rc8/kernel/locking/mcs_spinlock.c
===
--- linux-3.15-rc8.orig/kernel/locking/mcs_spinlock.c   2014-06-02 
17:11:16.0 +0200
+++ linux-3.15-rc8/kernel/locking/mcs_spinlock.c2014-06-02 
17:11:50.0 +0200
@@ -47,8 +47,8 @@ osq_wait_next(struct optimistic_spin_que
 * wait for either @lock to point to us, through its Step-B, or
 * wait for a new @node-next from its Step-C.
 */
-   if (node-next) {
-   next = xchg(node-next, NULL);
+   if (atomic_pointer_read(node-next)) {
+   next = atomic_pointer_xchg(node-next, NULL);
if (next)
break;
}
@@ -65,13 +65,13 @@ bool osq_lock(struct optimistic_spin_que
struct optimistic_spin_queue *prev, *next;
 
node-locked = 0;
-   node-next = NULL;
+   atomic_pointer_set(node-next, NULL);
 
node-prev = prev = xchg(lock, node);
if (likely(prev == NULL))
return true;
 
-   ACCESS_ONCE(prev-next) = node;
+   atomic_pointer_set(prev-next, node);
 
/*
 * Normally @prev is untouchable after the above store; because at that
@@ -103,8 +103,8

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 12:00:45PM -0400, Mikulas Patocka wrote:
  struct optimistic_spin_queue {
 - struct optimistic_spin_queue *next, *prev;
 + atomic_pointer(struct optimistic_spin_queue *) next;
 + struct optimistic_spin_queue *prev;
   int locked; /* 1 if lock acquired */
  };
  
 Index: linux-3.15-rc8/include/asm-generic/atomic-long.h
 ===
 --- linux-3.15-rc8.orig/include/asm-generic/atomic-long.h 2014-06-02 
 17:11:17.0 +0200
 +++ linux-3.15-rc8/include/asm-generic/atomic-long.h  2014-06-02 
 17:11:50.0 +0200
 @@ -255,4 +255,31 @@ static inline long atomic_long_add_unles
  
  #endif  /*  BITS_PER_LONG == 64  */
  
 +#define atomic_pointer(type) \
 +union {  
 \
 + atomic_long_t __a;  \
 + type __t;   \
 + char __check_sizeof[sizeof(type) == sizeof(long) ? 1 : -1]; \
 +}

That's still entirely disgusting, and afaict entirely redundant. You can
do that test in the operators below just fine.

 +#define ATOMIC_POINTER_INIT(i)   { .__t = (i) }
 +
 +#define atomic_pointer_read(v)   
 ((typeof((v)-__t))atomic_long_read((v)-__a))
 +
 +#define atomic_pointer_set(v, i) ({  \
 + typeof((v)-__t) __i = (i); \
 + atomic_long_set((v)-__a, (long)(__i));\
 +})
 +
 +#define atomic_pointer_xchg(v, i)({  \
 + typeof((v)-__t) __i = (i); \
 + (typeof((v)-__t))atomic_long_xchg((v)-__a, (long)(__i)); \
 +})
 +
 +#define atomic_pointer_cmpxchg(v, old, new)  ({  \
 + typeof((v)-__t) __old = (old); \
 + typeof((v)-__t) __new = (new); \
 + (typeof((v)-__t))atomic_long_cmpxchg((v)-__a, (long)(__old), 
 (long)(__new));\
 +})

And I can't say I'm a particular fan of these ops either, as alternative
I'm almost inclined to just exclude parisc from using opt spinning.

That said, this patch still doesn't address the far more interesting
problem of actually finding these issues for these few weird archs.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
 I'm almost inclined to just exclude parisc from using opt spinning.
 
 That said, this patch still doesn't address the far more interesting
 problem of actually finding these issues for these few weird archs.

So why do these archs provide xchg() and cmpxchg() at all? Wouldn't it
be much simpler if archs that cannot sanely do this, not provide these
primitives at all?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
 On Mon, Jun 02, 2014 at 12:00:45PM -0400, Mikulas Patocka wrote:
   struct optimistic_spin_queue {
  -   struct optimistic_spin_queue *next, *prev;
  +   atomic_pointer(struct optimistic_spin_queue *) next;
  +   struct optimistic_spin_queue *prev;
  int locked; /* 1 if lock acquired */
   };
   
  Index: linux-3.15-rc8/include/asm-generic/atomic-long.h
  ===
  --- linux-3.15-rc8.orig/include/asm-generic/atomic-long.h   2014-06-02 
  17:11:17.0 +0200
  +++ linux-3.15-rc8/include/asm-generic/atomic-long.h2014-06-02 
  17:11:50.0 +0200
  @@ -255,4 +255,31 @@ static inline long atomic_long_add_unles
   
   #endif  /*  BITS_PER_LONG == 64  */
   
  +#define atomic_pointer(type)   
  \
  +union {
  \
  +   atomic_long_t __a;  \
  +   type __t;   \
  +   char __check_sizeof[sizeof(type) == sizeof(long) ? 1 : -1]; \
  +}
 
 That's still entirely disgusting, and afaict entirely redundant. You can
 do that test in the operators below just fine.
 
  +#define ATOMIC_POINTER_INIT(i) { .__t = (i) }
  +
  +#define atomic_pointer_read(v) 
  ((typeof((v)-__t))atomic_long_read((v)-__a))
  +
  +#define atomic_pointer_set(v, i)   ({  \
  +   typeof((v)-__t) __i = (i); \
  +   atomic_long_set((v)-__a, (long)(__i));\
  +})
  +
  +#define atomic_pointer_xchg(v, i)  ({  \
  +   typeof((v)-__t) __i = (i); \
  +   (typeof((v)-__t))atomic_long_xchg((v)-__a, (long)(__i)); \
  +})
  +
  +#define atomic_pointer_cmpxchg(v, old, new)({  
  \
  +   typeof((v)-__t) __old = (old); \
  +   typeof((v)-__t) __new = (new); \
  +   (typeof((v)-__t))atomic_long_cmpxchg((v)-__a, (long)(__old), 
  (long)(__new));\
  +})
 
 And I can't say I'm a particular fan of these ops either, as alternative
 I'm almost inclined to just exclude parisc from using opt spinning.

That is an excellent point for this particular issue.  Do parisc systems
really support enough CPUs to make queued spinlocks worthwhile?  If not,
maybe we should just have parisc stick with traditional spinlocks.

 That said, this patch still doesn't address the far more interesting
 problem of actually finding these issues for these few weird archs.

Indeed.  And finding other lower-probability failures due to other
atomic manipulations of pointers that are also accessed with normal
loads and stores.

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 06:30:32PM +0200, Peter Zijlstra wrote:
 On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
  I'm almost inclined to just exclude parisc from using opt spinning.
  
  That said, this patch still doesn't address the far more interesting
  problem of actually finding these issues for these few weird archs.
 
 So why do these archs provide xchg() and cmpxchg() at all? Wouldn't it
 be much simpler if archs that cannot sanely do this, not provide these
 primitives at all?

Such architectures would also need to avoid NO_HZ_FULL_SYSIDLE and
RCU_NOCB_CPU, but those are probably entirely reasonable restrictions.

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Jason Low

On Mon, 2014-06-02 at 12:00 -0400, Mikulas Patocka wrote:
 If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
 the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
 so, in this case, cmpxchg or xchg isn't really atomic at all.

So if the problem is using ACCESS_ONCE writes with cmpxchg and xchg at
the same time, would the below change address this problem?

-
diff --git a/kernel/locking/mcs_spinlock.c b/kernel/locking/mcs_spinlock.c
index 838dc9e..8396721 100644
--- a/kernel/locking/mcs_spinlock.c
+++ b/kernel/locking/mcs_spinlock.c
@@ -71,7 +71,7 @@ bool osq_lock(struct optimistic_spin_queue **lock)
if (likely(prev == NULL))
return true;
 
-   ACCESS_ONCE(prev-next) = node;
+   xchg(prev-next, node);
 
/*
 * Normally @prev is untouchable after the above store; because at that
@@ -144,7 +144,7 @@ unqueue:
 */
 
ACCESS_ONCE(next-prev) = prev;
-   ACCESS_ONCE(prev-next) = next;
+   xchg(prev-next, next);
 
return false;
 }


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 09:50:10AM -0700, Jason Low wrote:
 On Mon, 2014-06-02 at 12:00 -0400, Mikulas Patocka wrote:
  If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
  the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
  so, in this case, cmpxchg or xchg isn't really atomic at all.
 
 So if the problem is using ACCESS_ONCE writes with cmpxchg and xchg at
 the same time, would the below change address this problem?

And one could use cmpxchg() or atomic_add_return(..., 0) to read a value
out.  Probably at the cost of some performance impact, though.

Thanx, Paul

 -
 diff --git a/kernel/locking/mcs_spinlock.c b/kernel/locking/mcs_spinlock.c
 index 838dc9e..8396721 100644
 --- a/kernel/locking/mcs_spinlock.c
 +++ b/kernel/locking/mcs_spinlock.c
 @@ -71,7 +71,7 @@ bool osq_lock(struct optimistic_spin_queue **lock)
   if (likely(prev == NULL))
   return true;
 
 - ACCESS_ONCE(prev-next) = node;
 + xchg(prev-next, node);
 
   /*
* Normally @prev is untouchable after the above store; because at that
 @@ -144,7 +144,7 @@ unqueue:
*/
 
   ACCESS_ONCE(next-prev) = prev;
 - ACCESS_ONCE(prev-next) = next;
 + xchg(prev-next, next);
 
   return false;
  }
 
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Linus Torvalds

On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra pet...@infradead.org wrote:

 And I can't say I'm a particular fan of these ops either, as alternative
 I'm almost inclined to just exclude parisc from using opt spinning.

Please do.

There is no way in hell that we should introduce a magic new
atomic_pointer thing for parisc. And the idea somebody had to change
ACCESS_ONCE() for PA-RISC (I'm not going to go back to find who to
blame) is just horribly wrong too, since it's not even necessary for
any normal use: the special load-and-store-zero instruction isn't
actually used for real data, it's used only for the special
spinlocks afaik, so doing it for all ACCESS_ONCE() users would be
wrong even on PA-RISC. For any normal data, the usual just load the
value, making sure the compiler doesn't reload it is perfectly fine -
even on PA-RISC.

Now, if PA-RISC was a major architecture, we'd have to figure this
out. But as it is, PA-RISC is just about the shittiest RISC ever
invented (with original sparc being a strong contender), and let's
face it, nobody really uses it.  It's a fun project, but it is not
something that we should use to mess up either ACCESS_ONCE() or the
MCS locks.

Just make PA-RISC use its own locks, not any of the new fancy ones.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Davidlohr Bueso

On Mon, 2014-06-02 at 10:09 -0700, Linus Torvalds wrote:
 On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra pet...@infradead.org wrote:
 
  And I can't say I'm a particular fan of these ops either, as alternative
  I'm almost inclined to just exclude parisc from using opt spinning.
 
 Please do.

I agree, this is the best way out of this mess. Furthermore, it would
also be nice to consolidate opt spinning in a common CONFIG option --
right now mutexes and rwsems create their own dependencies.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread James Bottomley

On Mon, 2014-06-02 at 09:43 -0700, Paul E. McKenney wrote:
 On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
  On Mon, Jun 02, 2014 at 12:00:45PM -0400, Mikulas Patocka wrote:
struct optimistic_spin_queue {
   - struct optimistic_spin_queue *next, *prev;
   + atomic_pointer(struct optimistic_spin_queue *) next;
   + struct optimistic_spin_queue *prev;
 int locked; /* 1 if lock acquired */
};

   Index: linux-3.15-rc8/include/asm-generic/atomic-long.h
   ===
   --- linux-3.15-rc8.orig/include/asm-generic/atomic-long.h 2014-06-02 
   17:11:17.0 +0200
   +++ linux-3.15-rc8/include/asm-generic/atomic-long.h  2014-06-02 
   17:11:50.0 +0200
   @@ -255,4 +255,31 @@ static inline long atomic_long_add_unles

#endif  /*  BITS_PER_LONG == 64  */

   +#define atomic_pointer(type) 
   \
   +union {  
   \
   + atomic_long_t __a;  \
   + type __t;   \
   + char __check_sizeof[sizeof(type) == sizeof(long) ? 1 : -1]; \
   +}
  
  That's still entirely disgusting, and afaict entirely redundant. You can
  do that test in the operators below just fine.
  
   +#define ATOMIC_POINTER_INIT(i)   { .__t = (i) }
   +
   +#define atomic_pointer_read(v)   
   ((typeof((v)-__t))atomic_long_read((v)-__a))
   +
   +#define atomic_pointer_set(v, i) ({  \
   + typeof((v)-__t) __i = (i); \
   + atomic_long_set((v)-__a, (long)(__i));\
   +})
   +
   +#define atomic_pointer_xchg(v, i)({  
   \
   + typeof((v)-__t) __i = (i); \
   + (typeof((v)-__t))atomic_long_xchg((v)-__a, (long)(__i)); \
   +})
   +
   +#define atomic_pointer_cmpxchg(v, old, new)  ({  
   \
   + typeof((v)-__t) __old = (old); \
   + typeof((v)-__t) __new = (new); \
   + (typeof((v)-__t))atomic_long_cmpxchg((v)-__a, (long)(__old), 
   (long)(__new));\
   +})
  
  And I can't say I'm a particular fan of these ops either, as alternative
  I'm almost inclined to just exclude parisc from using opt spinning.
 
 That is an excellent point for this particular issue.  Do parisc systems
 really support enough CPUs to make queued spinlocks worthwhile?  If not,
 maybe we should just have parisc stick with traditional spinlocks.

Yes and No.  No for Linux because the only hyper CPU system is the
superdome, which we've never managed to boot linux on (it has some
complexities in the Bus architecture) and we're not likely to try
because the installations tend to cost north of US$1m.  For the Server
systems we do have a few high CPU count ones, but we lost access to them
when HP dismantled the parisc linux lab.  Currently the standard is
about 4 cpus.

I think just not using queued spinlocks is fine for us.  Should anyone
ever try the large CPU systems, we can revisit.

James


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Waiman Long


On 06/02/2014 12:50 PM, Jason Low wrote:

On Mon, 2014-06-02 at 12:00 -0400, Mikulas Patocka wrote:

If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
so, in this case, cmpxchg or xchg isn't really atomic at all.

So if the problem is using ACCESS_ONCE writes with cmpxchg and xchg at
the same time, would the below change address this problem?

-
diff --git a/kernel/locking/mcs_spinlock.c b/kernel/locking/mcs_spinlock.c
index 838dc9e..8396721 100644
--- a/kernel/locking/mcs_spinlock.c
+++ b/kernel/locking/mcs_spinlock.c
@@ -71,7 +71,7 @@ bool osq_lock(struct optimistic_spin_queue **lock)
if (likely(prev == NULL))
return true;

-   ACCESS_ONCE(prev-next) = node;
+   xchg(prev-next, node);

/*
 * Normally @prev is untouchable after the above store; because at that
@@ -144,7 +144,7 @@ unqueue:
 */

ACCESS_ONCE(next-prev) = prev;
-   ACCESS_ONCE(prev-next) = next;
+   xchg(prev-next, next);

return false;
  }




Doing an xchg is a very expensive operation compared with ACCESS_ONCE. I 
will not suggest doing that to make it right for PA-RISC at the expense 
of performance in other architectures.


-Longman
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Waiman Long


On 06/02/2014 12:43 PM, Paul E. McKenney wrote:

On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:

On Mon, Jun 02, 2014 at 12:00:45PM -0400, Mikulas Patocka wrote:

  struct optimistic_spin_queue {
-   struct optimistic_spin_queue *next, *prev;
+   atomic_pointer(struct optimistic_spin_queue *) next;
+   struct optimistic_spin_queue *prev;
int locked; /* 1 if lock acquired */
  };

Index: linux-3.15-rc8/include/asm-generic/atomic-long.h
===
--- linux-3.15-rc8.orig/include/asm-generic/atomic-long.h   2014-06-02 
17:11:17.0 +0200
+++ linux-3.15-rc8/include/asm-generic/atomic-long.h2014-06-02 
17:11:50.0 +0200
@@ -255,4 +255,31 @@ static inline long atomic_long_add_unles

  #endif  /*  BITS_PER_LONG == 64  */

+#define atomic_pointer(type)   \
+union {
\
+   atomic_long_t __a;  \
+   type __t;   \
+   char __check_sizeof[sizeof(type) == sizeof(long) ? 1 : -1]; \
+}

That's still entirely disgusting, and afaict entirely redundant. You can
do that test in the operators below just fine.


+#define ATOMIC_POINTER_INIT(i) { .__t = (i) }
+
+#define atomic_pointer_read(v) ((typeof((v)-__t))atomic_long_read((v)-__a))
+
+#define atomic_pointer_set(v, i)   ({  \
+   typeof((v)-__t) __i = (i);  \
+   atomic_long_set((v)-__a, (long)(__i)); \
+})
+
+#define atomic_pointer_xchg(v, i)  ({  \
+   typeof((v)-__t) __i = (i);  \
+   (typeof((v)-__t))atomic_long_xchg((v)-__a, (long)(__i));   \
+})
+
+#define atomic_pointer_cmpxchg(v, old, new)({  \
+   typeof((v)-__t) __old = (old);  \
+   typeof((v)-__t) __new = (new);  \
+   (typeof((v)-__t))atomic_long_cmpxchg((v)-__a, (long)(__old), 
(long)(__new));\
+})

And I can't say I'm a particular fan of these ops either, as alternative
I'm almost inclined to just exclude parisc from using opt spinning.

That is an excellent point for this particular issue.  Do parisc systems
really support enough CPUs to make queued spinlocks worthwhile?  If not,
maybe we should just have parisc stick with traditional spinlocks.


The operation in question is the optimistic spinning code of mutex which 
is currently active, I think, for all architectures. It is not related 
to the queued spinlock, though it will have the same problem.


Yes, by disabling the MUTEX_SPIN_ON_OWNER config variable from PA-RISC, 
we can disable optimistic spinning.


-Longman
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Waiman Long


On 06/02/2014 12:30 PM, Peter Zijlstra wrote:

On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:

I'm almost inclined to just exclude parisc from using opt spinning.

That said, this patch still doesn't address the far more interesting
problem of actually finding these issues for these few weird archs.

So why do these archs provide xchg() and cmpxchg() at all? Wouldn't it
be much simpler if archs that cannot sanely do this, not provide these
primitives at all?


I believe xchg() and cmpxchg() are used in quite a number of places 
within the generic kernel code. So kernel compilation will fail if those 
APIs aren't provided by an architecture.


-Longman
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread H. Peter Anvin

On 06/02/2014 10:25 AM, Waiman Long wrote:
 
 Doing an xchg is a very expensive operation compared with ACCESS_ONCE. I
 will not suggest doing that to make it right for PA-RISC at the expense
 of performance in other architectures.
 

And of course, this gets into the toxic question: what are reasonable
minimum requirements for Linux?  How far do we need to stretch to
support niche architectures which have very small (Linux) userbases?

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Waiman Long


On 06/02/2014 01:12 PM, Davidlohr Bueso wrote:

On Mon, 2014-06-02 at 10:09 -0700, Linus Torvalds wrote:

On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstrapet...@infradead.org  wrote:

And I can't say I'm a particular fan of these ops either, as alternative
I'm almost inclined to just exclude parisc from using opt spinning.

Please do.

I agree, this is the best way out of this mess. Furthermore, it would
also be nice to consolidate opt spinning in a common CONFIG option --
right now mutexes and rwsems create their own dependencies.



I would suggest adding a RWSEM_SPIN_ON_OWNER to control opt spinning in 
rwsem. Currently MUTEX_SPIN_ON_OWNER is doing that for mutex, and it is 
disabled when mutex debugging is turned on. So I think it is better to 
allow them to be disabled separately.


-Longman
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Peter Zijlstra

On Mon, Jun 02, 2014 at 01:33:34PM -0400, Waiman Long wrote:
 On 06/02/2014 12:30 PM, Peter Zijlstra wrote:
 On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
 I'm almost inclined to just exclude parisc from using opt spinning.
 
 That said, this patch still doesn't address the far more interesting
 problem of actually finding these issues for these few weird archs.
 So why do these archs provide xchg() and cmpxchg() at all? Wouldn't it
 be much simpler if archs that cannot sanely do this, not provide these
 primitives at all?
 
 I believe xchg() and cmpxchg() are used in quite a number of places within
 the generic kernel code. So kernel compilation will fail if those APIs
 aren't provided by an architecture.

Yep.. so this is going to be painful for a while. But given their
(parisc, sparc32, metag-lock1) constraints, who knows how many of those
uses are actually broken.

So the question is, do you prefer subtly broken code or hard compile
fails? Me, I go for the compile fail.

In any case, this all goes towards what hpa said, what are the minimal
requirements we have for running Linux.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Linus Torvalds

On Mon, Jun 2, 2014 at 1:05 PM, Peter Zijlstra pet...@infradead.org wrote:

 So the question is, do you prefer subtly broken code or hard compile
 fails? Me, I go for the compile fail.

The thing is, parisc has a perfectly fine cmpxchg implementation in
practice, and ACCESS_ONCE() and friends work fine too for reading.

What the use a spinlock approach cannot generally do is:

 - ACCESS_ONCE() to _write_ things doesn't work well. You really
should use atomic_set().

 - you may not necessarily be able to mix partial updates (ie
differently sized updates to the same thing) depending on just how the
spinlock hashing works

but both of those are really rare issues and don't affect normal code.

I would not necessarily be opposed to splitting up ACCESS_ONCE() for
reading and for writing, and maybe we could do something special for
the writing path (which tends to be less ctitical). It's really mixing
ACCESS_ONCE(x) to _set_ a value, together with atomic ops to update
it, that ends up being problematic.

Maybe there are other issues I can't think of right now. But
basically, parisc _can_ do cmpxchg, it's just that the code needs to
be somewhat sanitized.

Side note: some of the RCU code uses ACCESS_ONCE() for
read-modify-write code, which is just f*cking crazy. The semantics are
dubious, and it generally makes gcc create bad code too.

 Linus
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread James Bottomley

On Mon, 2014-06-02 at 22:05 +0200, Peter Zijlstra wrote:
 On Mon, Jun 02, 2014 at 01:33:34PM -0400, Waiman Long wrote:
  On 06/02/2014 12:30 PM, Peter Zijlstra wrote:
  On Mon, Jun 02, 2014 at 06:25:25PM +0200, Peter Zijlstra wrote:
  I'm almost inclined to just exclude parisc from using opt spinning.
  
  That said, this patch still doesn't address the far more interesting
  problem of actually finding these issues for these few weird archs.
  So why do these archs provide xchg() and cmpxchg() at all? Wouldn't it
  be much simpler if archs that cannot sanely do this, not provide these
  primitives at all?
  
  I believe xchg() and cmpxchg() are used in quite a number of places within
  the generic kernel code. So kernel compilation will fail if those APIs
  aren't provided by an architecture.
 
 Yep.. so this is going to be painful for a while. But given their
 (parisc, sparc32, metag-lock1) constraints, who knows how many of those
 uses are actually broken.
 
 So the question is, do you prefer subtly broken code or hard compile
 fails? Me, I go for the compile fail.

The failure is only when a variable that will have an atomic exchange
done on it is updated by a simple operation.  To do this properly, we'd
probably need an update macro we could supply the locking to, and a way
of marking the variable to get the compiler to cause a build error if it
was ever updated improperly, but that's starting to look very similar to
Mikulas' proposal.

James


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Mikulas Patocka



On Mon, 2 Jun 2014, Linus Torvalds wrote:

 On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra pet...@infradead.org wrote:
 
  And I can't say I'm a particular fan of these ops either, as alternative
  I'm almost inclined to just exclude parisc from using opt spinning.
 
 Please do.
 
 There is no way in hell that we should introduce a magic new
 atomic_pointer thing for parisc. And the idea somebody had to change
 ACCESS_ONCE() for PA-RISC (I'm not going to go back to find who to
 blame) is just horribly wrong too, since it's not even necessary for
 any normal use: the special load-and-store-zero instruction isn't
 actually used for real data, it's used only for the special
 spinlocks afaik, so doing it for all ACCESS_ONCE() users would be
 wrong even on PA-RISC. For any normal data, the usual just load the
 value, making sure the compiler doesn't reload it is perfectly fine -
 even on PA-RISC.
 
 Now, if PA-RISC was a major architecture, we'd have to figure this
 out. But as it is, PA-RISC is just about the shittiest RISC ever
 invented (with original sparc being a strong contender), and let's
 face it, nobody really uses it.  It's a fun project, but it is not
 something that we should use to mess up either ACCESS_ONCE() or the
 MCS locks.
 
 Just make PA-RISC use its own locks, not any of the new fancy ones.
 
   Linus

And what else do you want to do?

Peter Zijlstra said I've been using xchg() and cmpxchg() without such 
consideration for quite a while. - so it basically implies that the 
kernel is full of such races, mcs_spinlock is just the most visible one 
that crashes the kernel first.

It's not only parisc - tile32, arc, metag (maybe hexagon) are broken too, 
because they don't have cmpxchg in hardware.

We have atomic_t, atomic64_t, atomic_long_t that can be sanely used even 
on architectures without hardware cmpxchg - so I ask - why can't we have 
atomic_pointer_t with the same semantics? (pointer type conversion issues 
can be solved, as it is done in the PATCH v2)

Mikulas
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Linus Torvalds

On Mon, Jun 2, 2014 at 1:46 PM, Mikulas Patocka mpato...@redhat.com wrote:

 And what else do you want to do?

 Peter Zijlstra said I've been using xchg() and cmpxchg() without such
 consideration for quite a while. - so it basically implies that the
 kernel is full of such races, mcs_spinlock is just the most visible one
 that crashes the kernel first.

.. so your whole argument is bogus, because it doesn't actually fix
anything else.

Now, something that *would* fix something else is (for example) to
just make ACCESS_ONCE() a rvalue so that you cannot use it for
assignments, and then trying to sort out what happens then. It's
possible that the atomic_pointer_t would be a part of the solution
to that what happens then, but THERE IS NO WAY IN HELL we're adding
it for just one architecture and one use that doesn't warrant even
_existing_ on that architecture.

See what I'm saying?

You're not fixing the problem, you're fixing one unimportant detail
that isn't worth fixing that way.

   Linus
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 10:09:35AM -0700, Linus Torvalds wrote:
 On Mon, Jun 2, 2014 at 9:25 AM, Peter Zijlstra pet...@infradead.org wrote:
 
  And I can't say I'm a particular fan of these ops either, as alternative
  I'm almost inclined to just exclude parisc from using opt spinning.
 
 Please do.
 
 There is no way in hell that we should introduce a magic new
 atomic_pointer thing for parisc. And the idea somebody had to change
 ACCESS_ONCE() for PA-RISC (I'm not going to go back to find who to
 blame) is just horribly wrong too, since it's not even necessary for
 any normal use: the special load-and-store-zero instruction isn't
 actually used for real data, it's used only for the special
 spinlocks afaik, so doing it for all ACCESS_ONCE() users would be
 wrong even on PA-RISC. For any normal data, the usual just load the
 value, making sure the compiler doesn't reload it is perfectly fine -
 even on PA-RISC.

Guilty to charges as read on suggesting PA-RISC-specific ACCESS_ONCE().  ;-)

Thanx, Paul

 Now, if PA-RISC was a major architecture, we'd have to figure this
 out. But as it is, PA-RISC is just about the shittiest RISC ever
 invented (with original sparc being a strong contender), and let's
 face it, nobody really uses it.  It's a fun project, but it is not
 something that we should use to mess up either ACCESS_ONCE() or the
 MCS locks.
 
 Just make PA-RISC use its own locks, not any of the new fancy ones.
 
   Linus
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 01:22:10PM -0700, Linus Torvalds wrote:
 On Mon, Jun 2, 2014 at 1:05 PM, Peter Zijlstra pet...@infradead.org wrote:
 
  So the question is, do you prefer subtly broken code or hard compile
  fails? Me, I go for the compile fail.
 
 The thing is, parisc has a perfectly fine cmpxchg implementation in
 practice, and ACCESS_ONCE() and friends work fine too for reading.
 
 What the use a spinlock approach cannot generally do is:
 
  - ACCESS_ONCE() to _write_ things doesn't work well. You really
 should use atomic_set().
 
  - you may not necessarily be able to mix partial updates (ie
 differently sized updates to the same thing) depending on just how the
 spinlock hashing works
 
 but both of those are really rare issues and don't affect normal code.
 
 I would not necessarily be opposed to splitting up ACCESS_ONCE() for
 reading and for writing, and maybe we could do something special for
 the writing path (which tends to be less ctitical). It's really mixing
 ACCESS_ONCE(x) to _set_ a value, together with atomic ops to update
 it, that ends up being problematic.

Knowing what I know now about how ACCESS_ONCE() is used, I would have
split it for reading and writing to begin with.  Where is that time
machine when you need it?  ;-)

 Maybe there are other issues I can't think of right now. But
 basically, parisc _can_ do cmpxchg, it's just that the code needs to
 be somewhat sanitized.
 
 Side note: some of the RCU code uses ACCESS_ONCE() for
 read-modify-write code, which is just f*cking crazy. The semantics are
 dubious, and it generally makes gcc create bad code too.

A couple of the places are admittedly overkill, for example the pair of:

ACCESS_ONCE(rsp-n_force_qs_lh)++;

which is just for debug statistics in any case.  I could put these back
to rsp-n_force_qs_lh++; without problems, if desired.  (Yeah, I know,
I am overly paranoid.)

However, these cases need a bit more care:

ACCESS_ONCE(rdp-qlen)++;
ACCESS_ONCE(rsp-n_barrier_done)++;
ACCESS_ONCE(sync_rcu_preempt_exp_count)++;

In the -qlen case, interrupts are disabled and the current CPU is
the only one who can write, so the read need not be volatile.  In the
-n_barrier_done, modifications are done holding -barrier_mutex, so again
the read need not be volatile.  In the sync_rcu_preempt_exp_count case,
modifications are done holding sync_rcu_preempt_exp_mutex, so once again,
the read need not be volatile.  So I could do something like:

ACCESS_ONCE(rdp-qlen) = rdp-qlen + 1;

But that still makes gcc generate bad code.

The reason I was not all that worried about this is that these are not
in fastpaths, and the last two are especially not in fastpaths.

Suggestions?

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Linus Torvalds

On Mon, Jun 2, 2014 at 2:02 PM, Paul E. McKenney
paul...@linux.vnet.ibm.com wrote:

 In the -qlen case, interrupts are disabled and the current CPU is
 the only one who can write, so the read need not be volatile.  In the
 -n_barrier_done, modifications are done holding -barrier_mutex, so again
 the read need not be volatile.  In the sync_rcu_preempt_exp_count case,
 modifications are done holding sync_rcu_preempt_exp_mutex, so once again,
 the read need not be volatile.  So I could do something like:

 ACCESS_ONCE(rdp-qlen) = rdp-qlen + 1;

 But that still makes gcc generate bad code

 The reason I was not all that worried about this is that these are not
 in fastpaths, and the last two are especially not in fastpaths.

 Suggestions?

So I think it probably *works*, but even so splitting it up to use
ACCESS_ONCE() on just the write is probably a better option, if only
because it would then make it much easier to change if we do end up
splitting reads and writes.

Because from a gcc code generation standpoint, using volatile will
always be horrible, because gcc will never be able to turn it into a
read-modify-write cycle. Arguable gcc _should_ be able to do that (it
is certainly allowable within the virtual machine definition), but I
understand why it doesn't (volatile? Let's not optimize anything at
all, because it's special).

So ACCESS_ONCE() + R-M-W operation is actually pretty much
guaranteed to be ACCESS_TWICE(), which may well be ok (performance
may not matter, and even when it does most architectures don't
actually have r-m-w instructions and when they do they aren't always
even faster), but I think it's just horribly horribly bad from a
conceptual and readability standpoint because it's so misleading.

So I'd actually rather see two explicit ACCESS_ONCE() calls - once to
read, once to write. Because that at least describes what is
happening, unlike the current situation.

Put another way: I can understand why you do it, and I can even agree
that it is correct from a functionality standpoint. But even despite
that all, I really don't like the construct very much..

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Mikulas Patocka



On Mon, 2 Jun 2014, Linus Torvalds wrote:

 On Mon, Jun 2, 2014 at 1:46 PM, Mikulas Patocka mpato...@redhat.com wrote:
 
  And what else do you want to do?
 
  Peter Zijlstra said I've been using xchg() and cmpxchg() without such
  consideration for quite a while. - so it basically implies that the
  kernel is full of such races, mcs_spinlock is just the most visible one
  that crashes the kernel first.
 
 .. so your whole argument is bogus, because it doesn't actually fix
 anything else.

 Now, something that *would* fix something else is (for example) to
 just make ACCESS_ONCE() a rvalue so that you cannot use it for
 assignments, and then trying to sort out what happens then. It's
 possible that the atomic_pointer_t would be a part of the solution
 to that what happens then, but THERE IS NO WAY IN HELL we're adding
 it for just one architecture and one use that doesn't warrant even
 _existing_ on that architecture.

The patch adds atomic_pointer_t for all architectures - it is in the 
common code and it is backed by atomic_long_t (that already exists for all 
architectures). There is no new arch-specific code at all.

When we have atomic_pointer_t, we can find the instances of xchg() and 
cmpxchg() and convert them to atomic_pointer_t (or to other atomic*_t 
types).

When we convert them all, we can drop xchg() and cmpxchg() at all (at 
least from architecture-neutral code).

The problem with xchg() and cmpxchg() is that they are very easy to 
misuse. Peter Zijlstra didn't know that they are not atomic w.r.t. normal 
stores, a lot of other people don't know it too - and if we allow these 
functions to be used, this race condition will reappear in the future 
again and again.

That's why I'm proposing atomic_pointer_t - it guarantees that this race 
condition can't be made.

 See what I'm saying?
 
 You're not fixing the problem, you're fixing one unimportant detail
 that isn't worth fixing that way.
 
Linus

Regarding reworking ACCESS_ONCE() for reads and writes - the problem is - 
how do you make people use it? ACCESS_ONCE() is already missing at a lot 
of places (it doesn't cause any visible bug on the condition that the 
compiler doesn't split the load or store to multiple accesses), I can 
assume that people will omit ATOMIC_ONCE_STORE() too even if we make it.

Mikulas
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 02:12:30PM -0700, Linus Torvalds wrote:
 On Mon, Jun 2, 2014 at 2:02 PM, Paul E. McKenney
 paul...@linux.vnet.ibm.com wrote:
 
  In the -qlen case, interrupts are disabled and the current CPU is
  the only one who can write, so the read need not be volatile.  In the
  -n_barrier_done, modifications are done holding -barrier_mutex, so again
  the read need not be volatile.  In the sync_rcu_preempt_exp_count case,
  modifications are done holding sync_rcu_preempt_exp_mutex, so once again,
  the read need not be volatile.  So I could do something like:
 
  ACCESS_ONCE(rdp-qlen) = rdp-qlen + 1;
 
  But that still makes gcc generate bad code
 
  The reason I was not all that worried about this is that these are not
  in fastpaths, and the last two are especially not in fastpaths.
 
  Suggestions?
 
 So I think it probably *works*, but even so splitting it up to use
 ACCESS_ONCE() on just the write is probably a better option, if only
 because it would then make it much easier to change if we do end up
 splitting reads and writes.
 
 Because from a gcc code generation standpoint, using volatile will
 always be horrible, because gcc will never be able to turn it into a
 read-modify-write cycle. Arguable gcc _should_ be able to do that (it
 is certainly allowable within the virtual machine definition), but I
 understand why it doesn't (volatile? Let's not optimize anything at
 all, because it's special).
 
 So ACCESS_ONCE() + R-M-W operation is actually pretty much
 guaranteed to be ACCESS_TWICE(), which may well be ok (performance
 may not matter, and even when it does most architectures don't
 actually have r-m-w instructions and when they do they aren't always
 even faster), but I think it's just horribly horribly bad from a
 conceptual and readability standpoint because it's so misleading.
 
 So I'd actually rather see two explicit ACCESS_ONCE() calls - once to
 read, once to write. Because that at least describes what is
 happening, unlike the current situation.
 
 Put another way: I can understand why you do it, and I can even agree
 that it is correct from a functionality standpoint. But even despite
 that all, I really don't like the construct very much..

OK, I have queued the following commit for 3.17.  Is this what you had
in mind?

Thanx, Paul



rcu: Eliminate read-modify-write ACCESS_ONCE() calls

RCU contains code of the following forms:

ACCESS_ONCE(x)++;
ACCESS_ONCE(x) += y;
ACCESS_ONCE(x) -= y;

Now these constructs do operate correctly, but they really result in a
pair of volatile accesses, one to do the load and another to do the store.
This can be confusing, as the casual reader might well assume that (for
example) gcc might generate a memory-to-memory add instruction for each
of these three cases.  In fact, gcc will do no such thing.  Also, there
is a good chance that the kernel will move to separate load and store
variants of ACCESS_ONCE(), and constructs like the above could easily
confuse both people and scripts attempting to make that sort of change.
Finally, most of RCU's read-modify-write uses of ACCESS_ONCE() really
only need the store to be volatile, so that the read-modify-write form
might be misleading.

This commit therefore changes the above forms in RCU so that each instance
of ACCESS_ONCE() either does a load or a store, but not both.  In a few
cases, ACCESS_ONCE() was not critical, for example, for maintaining 
statisitics.  In these cases, ACCESS_ONCE() has been dispensed with
entirely

Suggested-by: Linus Torvalds torva...@linux-foundation.org
Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com

diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index c639556f3fa0..c0120279dead 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -295,12 +295,15 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
 int __srcu_read_lock(struct srcu_struct *sp)
 {
int idx;
+   unsigned long *lp;
 
idx = ACCESS_ONCE(sp-completed)  0x1;
preempt_disable();
-   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-c[idx]) += 1;
+   lp = this_cpu_ptr(sp-per_cpu_ref-c[idx]);
+   ACCESS_ONCE(*lp) = *lp + 1;
smp_mb(); /* B */  /* Avoid leaking the critical section. */
-   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-seq[idx]) += 1;
+   lp = this_cpu_ptr(sp-per_cpu_ref-seq[idx]);
+   ACCESS_ONCE(*lp) = *lp + 1;
preempt_enable();
return idx;
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d1c8e4a85b92..f0ed867070cd 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2275,7 +2275,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct 
rcu_data *rdp)
}
smp_mb(); /* List handling before counting for rcu_barrier(). */
rdp-qlen_lazy -= count_lazy;
-   ACCESS_ONCE(rdp-qlen) -= count;
+   ACCESS_ONCE(rdp-qlen) =

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Eric Dumazet

On Mon, 2014-06-02 at 15:08 -0700, Paul E. McKenney wrote:

 diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
 index c639556f3fa0..c0120279dead 100644
 --- a/kernel/rcu/srcu.c
 +++ b/kernel/rcu/srcu.c
 @@ -295,12 +295,15 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
  int __srcu_read_lock(struct srcu_struct *sp)
  {
   int idx;
 + unsigned long *lp;
  
   idx = ACCESS_ONCE(sp-completed)  0x1;
   preempt_disable();
 - ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-c[idx]) += 1;
 + lp = this_cpu_ptr(sp-per_cpu_ref-c[idx]);
 + ACCESS_ONCE(*lp) = *lp + 1;
   smp_mb(); /* B */  /* Avoid leaking the critical section. */
 - ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-seq[idx]) += 1;
 + lp = this_cpu_ptr(sp-per_cpu_ref-seq[idx]);
 + ACCESS_ONCE(*lp) = *lp + 1;
   preempt_enable();
   return idx;
  

This probably could use the following 

diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index c639556f3fa0..3a97eb6f9076 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -298,9 +298,9 @@ int __srcu_read_lock(struct srcu_struct *sp)
 
idx = ACCESS_ONCE(sp-completed)  0x1;
preempt_disable();
-   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-c[idx]) += 1;
+   this_cpu_inc(sp-per_cpu_ref-c[idx]);
smp_mb(); /* B */  /* Avoid leaking the critical section. */
-   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-seq[idx]) += 1;
+   this_cpu_inc(sp-per_cpu_ref-seq[idx]);
preempt_enable();
return idx;
 }


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Linus Torvalds

On Mon, Jun 2, 2014 at 3:08 PM, Paul E. McKenney
paul...@linux.vnet.ibm.com wrote:

 rcu: Eliminate read-modify-write ACCESS_ONCE() calls

 preempt_disable();
 -   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-c[idx]) += 1;
 +   lp = this_cpu_ptr(sp-per_cpu_ref-c[idx]);
 +   ACCESS_ONCE(*lp) = *lp + 1;
 smp_mb(); /* B */  /* Avoid leaking the critical section. */
 -   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-seq[idx]) += 1;
 +   lp = this_cpu_ptr(sp-per_cpu_ref-seq[idx]);
 +   ACCESS_ONCE(*lp) = *lp + 1;
 preempt_enable();
 return idx;

What Eric said. This should just use this_cpu_inc() instead.
Particularly with the smp_mb() and the preempt_enable(), there's no
way that could/should leak, and the ACCESS_ONCE() seems pointless and
ugly.

And the good news is, gcc _will_ generate good code for that.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 03:44:48PM -0700, Eric Dumazet wrote:
 On Mon, 2014-06-02 at 15:08 -0700, Paul E. McKenney wrote:
 
  diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
  index c639556f3fa0..c0120279dead 100644
  --- a/kernel/rcu/srcu.c
  +++ b/kernel/rcu/srcu.c
  @@ -295,12 +295,15 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
   int __srcu_read_lock(struct srcu_struct *sp)
   {
  int idx;
  +   unsigned long *lp;
   
  idx = ACCESS_ONCE(sp-completed)  0x1;
  preempt_disable();
  -   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-c[idx]) += 1;
  +   lp = this_cpu_ptr(sp-per_cpu_ref-c[idx]);
  +   ACCESS_ONCE(*lp) = *lp + 1;
  smp_mb(); /* B */  /* Avoid leaking the critical section. */
  -   ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-seq[idx]) += 1;
  +   lp = this_cpu_ptr(sp-per_cpu_ref-seq[idx]);
  +   ACCESS_ONCE(*lp) = *lp + 1;
  preempt_enable();
  return idx;
   
 
 This probably could use the following 
 
 diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
 index c639556f3fa0..3a97eb6f9076 100644
 --- a/kernel/rcu/srcu.c
 +++ b/kernel/rcu/srcu.c
 @@ -298,9 +298,9 @@ int __srcu_read_lock(struct srcu_struct *sp)
 
   idx = ACCESS_ONCE(sp-completed)  0x1;
   preempt_disable();
 - ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-c[idx]) += 1;
 + this_cpu_inc(sp-per_cpu_ref-c[idx]);
   smp_mb(); /* B */  /* Avoid leaking the critical section. */
 - ACCESS_ONCE(this_cpu_ptr(sp-per_cpu_ref)-seq[idx]) += 1;
 + this_cpu_inc(sp-per_cpu_ref-seq[idx]);
   preempt_enable();
   return idx;
  }

Good point!

But given that I already have preemption disabled and given that
__srcu_read_lock() is not to be used by irq handlers, I should be able to
use __this_cpu_inc(), correct?  Just to avoid unnecessary irq disabling
on non-x86 platforms...

Seems to pass a quick build, so trying a bit heavier testing.

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Eric Dumazet

On Mon, 2014-06-02 at 16:17 -0700, Paul E. McKenney wrote:

 But given that I already have preemption disabled and given that
 __srcu_read_lock() is not to be used by irq handlers, I should be able to
 use __this_cpu_inc(), correct?  Just to avoid unnecessary irq disabling
 on non-x86 platforms...

Absolutely, __this_cpu_inc() is OK here.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

2014-06-02 Thread Paul E. McKenney

On Mon, Jun 02, 2014 at 04:53:44PM -0700, Eric Dumazet wrote:
 On Mon, 2014-06-02 at 16:17 -0700, Paul E. McKenney wrote:
 
  But given that I already have preemption disabled and given that
  __srcu_read_lock() is not to be used by irq handlers, I should be able to
  use __this_cpu_inc(), correct?  Just to avoid unnecessary irq disabling
  on non-x86 platforms...
 
 Absolutely, __this_cpu_inc() is OK here.

Cool, giving it a test...

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

86 matches

Mail list logo