Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-09-01 Thread Peter Zijlstra
On Thu, Sep 01, 2016 at 10:17:57PM +0800, Boqun Feng wrote: > On Thu, Sep 01, 2016 at 08:57:38AM +0200, Peter Zijlstra wrote: > Could there be some code that relies on the full barrier semantics of > schedule() to provide transitivity? Could, sure, who knows. RCU might, although Paul typically

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-09-01 Thread Peter Zijlstra
On Thu, Sep 01, 2016 at 10:17:57PM +0800, Boqun Feng wrote: > On Thu, Sep 01, 2016 at 08:57:38AM +0200, Peter Zijlstra wrote: > Could there be some code that relies on the full barrier semantics of > schedule() to provide transitivity? Could, sure, who knows. RCU might, although Paul typically

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-09-01 Thread Boqun Feng
On Thu, Sep 01, 2016 at 08:57:38AM +0200, Peter Zijlstra wrote: > On Thu, Sep 01, 2016 at 07:47:10AM +1000, Benjamin Herrenschmidt wrote: > > > > OK, for giggles, could you (or Balbir) check what happens if you take > > > that sync out? > > > The problem is no amount of testing can tell you it

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-09-01 Thread Boqun Feng
On Thu, Sep 01, 2016 at 08:57:38AM +0200, Peter Zijlstra wrote: > On Thu, Sep 01, 2016 at 07:47:10AM +1000, Benjamin Herrenschmidt wrote: > > > > OK, for giggles, could you (or Balbir) check what happens if you take > > > that sync out? > > > The problem is no amount of testing can tell you it

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-09-01 Thread Alexey Kardashevskiy
On 01/09/16 11:48, Alexey Kardashevskiy wrote: > On 31/08/16 17:28, Peter Zijlstra wrote: >> On Wed, Aug 31, 2016 at 01:41:33PM +1000, Balbir Singh wrote: >>> On 30/08/16 22:19, Peter Zijlstra wrote: On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: > > > The origin of

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-09-01 Thread Alexey Kardashevskiy
On 01/09/16 11:48, Alexey Kardashevskiy wrote: > On 31/08/16 17:28, Peter Zijlstra wrote: >> On Wed, Aug 31, 2016 at 01:41:33PM +1000, Balbir Singh wrote: >>> On 30/08/16 22:19, Peter Zijlstra wrote: On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: > > > The origin of

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-09-01 Thread Peter Zijlstra
On Thu, Sep 01, 2016 at 07:47:10AM +1000, Benjamin Herrenschmidt wrote: > > OK, for giggles, could you (or Balbir) check what happens if you take > > that sync out? > The problem is no amount of testing can tell you it works for sure :-) It breaking does prove the negative though, so still

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-09-01 Thread Peter Zijlstra
On Thu, Sep 01, 2016 at 07:47:10AM +1000, Benjamin Herrenschmidt wrote: > > OK, for giggles, could you (or Balbir) check what happens if you take > > that sync out? > The problem is no amount of testing can tell you it works for sure :-) It breaking does prove the negative though, so still

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-09-01 Thread Balbir Singh
On 01/09/16 07:47, Benjamin Herrenschmidt wrote: > On Wed, 2016-08-31 at 15:31 +0200, Peter Zijlstra wrote: >> On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt >> wrote: >> >>> >>> On powerpc we have a sync deep in _switch to achieve that. >> >> OK, for giggles, could you (or

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-09-01 Thread Balbir Singh
On 01/09/16 07:47, Benjamin Herrenschmidt wrote: > On Wed, 2016-08-31 at 15:31 +0200, Peter Zijlstra wrote: >> On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt >> wrote: >> >>> >>> On powerpc we have a sync deep in _switch to achieve that. >> >> OK, for giggles, could you (or

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Alexey Kardashevskiy
On 31/08/16 17:28, Peter Zijlstra wrote: > On Wed, Aug 31, 2016 at 01:41:33PM +1000, Balbir Singh wrote: >> On 30/08/16 22:19, Peter Zijlstra wrote: >>> On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: The origin of the issue I've seen seems to be related to rwsem

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Alexey Kardashevskiy
On 31/08/16 17:28, Peter Zijlstra wrote: > On Wed, Aug 31, 2016 at 01:41:33PM +1000, Balbir Singh wrote: >> On 30/08/16 22:19, Peter Zijlstra wrote: >>> On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: The origin of the issue I've seen seems to be related to rwsem

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Benjamin Herrenschmidt
On Wed, 2016-08-31 at 15:31 +0200, Peter Zijlstra wrote: > On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt > wrote: > > > > > On powerpc we have a sync deep in _switch to achieve that. > > OK, for giggles, could you (or Balbir) check what happens if you take > that sync out? >

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Benjamin Herrenschmidt
On Wed, 2016-08-31 at 15:31 +0200, Peter Zijlstra wrote: > On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt > wrote: > > > > > On powerpc we have a sync deep in _switch to achieve that. > > OK, for giggles, could you (or Balbir) check what happens if you take > that sync out? >

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Peter Zijlstra
On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt wrote: > On powerpc we have a sync deep in _switch to achieve that. OK, for giggles, could you (or Balbir) check what happens if you take that sync out? There should be enough serialization in the generic code to cover the case

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Peter Zijlstra
On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt wrote: > On powerpc we have a sync deep in _switch to achieve that. OK, for giggles, could you (or Balbir) check what happens if you take that sync out? There should be enough serialization in the generic code to cover the case

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Benjamin Herrenschmidt
On Wed, 2016-08-31 at 09:18 +0200, Peter Zijlstra wrote: > On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt > wrote: > > > > It's always been a requirement that if you actually context switch > > a > > full mb() is implied ... > > > > > On powerpc we have a sync deep in _switch

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Benjamin Herrenschmidt
On Wed, 2016-08-31 at 09:20 +0200, Peter Zijlstra wrote: > On Wed, Aug 31, 2016 at 07:25:01AM +1000, Benjamin Herrenschmidt wrote: > > > > On Tue, 2016-08-30 at 15:04 +0200, Oleg Nesterov wrote: > > > > > > > > > Confused... how this connects to UNLOCK+LOCK on rq->lock? A LOAD can > > > leak

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Benjamin Herrenschmidt
On Wed, 2016-08-31 at 09:18 +0200, Peter Zijlstra wrote: > On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt > wrote: > > > > It's always been a requirement that if you actually context switch > > a > > full mb() is implied ... > > > > > On powerpc we have a sync deep in _switch

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Benjamin Herrenschmidt
On Wed, 2016-08-31 at 09:20 +0200, Peter Zijlstra wrote: > On Wed, Aug 31, 2016 at 07:25:01AM +1000, Benjamin Herrenschmidt wrote: > > > > On Tue, 2016-08-30 at 15:04 +0200, Oleg Nesterov wrote: > > > > > > > > > Confused... how this connects to UNLOCK+LOCK on rq->lock? A LOAD can > > > leak

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Benjamin Herrenschmidt
On Wed, 2016-08-31 at 09:28 +0200, Peter Zijlstra wrote: > > What hardware do you see this on, is it shiny new Power8 chips which > have never before seen deep queues or something. Or is it 'regular' > old Power7 like stuff? Power8 which isn't *that* new these days... Cheers, Ben.

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Benjamin Herrenschmidt
On Wed, 2016-08-31 at 09:28 +0200, Peter Zijlstra wrote: > > What hardware do you see this on, is it shiny new Power8 chips which > have never before seen deep queues or something. Or is it 'regular' > old Power7 like stuff? Power8 which isn't *that* new these days... Cheers, Ben.

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Balbir Singh
On 31/08/16 17:28, Peter Zijlstra wrote: > On Wed, Aug 31, 2016 at 01:41:33PM +1000, Balbir Singh wrote: >> On 30/08/16 22:19, Peter Zijlstra wrote: >>> On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: The origin of the issue I've seen seems to be related to

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Balbir Singh
On 31/08/16 17:28, Peter Zijlstra wrote: > On Wed, Aug 31, 2016 at 01:41:33PM +1000, Balbir Singh wrote: >> On 30/08/16 22:19, Peter Zijlstra wrote: >>> On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: The origin of the issue I've seen seems to be related to

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Peter Zijlstra
On Wed, Aug 31, 2016 at 01:41:33PM +1000, Balbir Singh wrote: > On 30/08/16 22:19, Peter Zijlstra wrote: > > On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: > >> > >> > >> The origin of the issue I've seen seems to be related to > >> rwsem spin lock stealing. Basically I see the

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Peter Zijlstra
On Wed, Aug 31, 2016 at 01:41:33PM +1000, Balbir Singh wrote: > On 30/08/16 22:19, Peter Zijlstra wrote: > > On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: > >> > >> > >> The origin of the issue I've seen seems to be related to > >> rwsem spin lock stealing. Basically I see the

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Peter Zijlstra
On Wed, Aug 31, 2016 at 07:25:01AM +1000, Benjamin Herrenschmidt wrote: > On Tue, 2016-08-30 at 15:04 +0200, Oleg Nesterov wrote: > > > > Confused... how this connects to UNLOCK+LOCK on rq->lock? A LOAD can > > leak into the critical section. > > > > But context switch should imply mb() we can

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Peter Zijlstra
On Wed, Aug 31, 2016 at 07:25:01AM +1000, Benjamin Herrenschmidt wrote: > On Tue, 2016-08-30 at 15:04 +0200, Oleg Nesterov wrote: > > > > Confused... how this connects to UNLOCK+LOCK on rq->lock? A LOAD can > > leak into the critical section. > > > > But context switch should imply mb() we can

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Peter Zijlstra
On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt wrote: > It's always been a requirement that if you actually context switch a > full mb() is implied ... > On powerpc we have a sync deep in _switch to achieve that. OK, fair enough. I must've missed it in the x86 switch_to, must

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-31 Thread Peter Zijlstra
On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt wrote: > It's always been a requirement that if you actually context switch a > full mb() is implied ... > On powerpc we have a sync deep in _switch to achieve that. OK, fair enough. I must've missed it in the x86 switch_to, must

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Balbir Singh
On 30/08/16 22:19, Peter Zijlstra wrote: > On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: >> >> >> The origin of the issue I've seen seems to be related to >> rwsem spin lock stealing. Basically I see the system deadlock'd in the >> following state > > As Nick says (good to see

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Balbir Singh
On 30/08/16 22:19, Peter Zijlstra wrote: > On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: >> >> >> The origin of the issue I've seen seems to be related to >> rwsem spin lock stealing. Basically I see the system deadlock'd in the >> following state > > As Nick says (good to see

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Balbir Singh
On 30/08/16 22:58, Oleg Nesterov wrote: > On 08/30, Balbir Singh wrote: >> >> The origin of the issue I've seen seems to be related to >> rwsem spin lock stealing. Basically I see the system deadlock'd in the >> following state >> >> I have a system with multiple threads and >> >> Most of the

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Balbir Singh
On 30/08/16 22:58, Oleg Nesterov wrote: > On 08/30, Balbir Singh wrote: >> >> The origin of the issue I've seen seems to be related to >> rwsem spin lock stealing. Basically I see the system deadlock'd in the >> following state >> >> I have a system with multiple threads and >> >> Most of the

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Benjamin Herrenschmidt
On Tue, 2016-08-30 at 20:34 +0200, Peter Zijlstra wrote: > > I'm not actually sure it does. There is the comment from 8643cda549ca4 > which explain the program order guarantees. > > But I'm not sure who or what would simply a full smp_mb() when you call > schedule() -- I mean, its true on x86,

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Benjamin Herrenschmidt
On Tue, 2016-08-30 at 20:34 +0200, Peter Zijlstra wrote: > > I'm not actually sure it does. There is the comment from 8643cda549ca4 > which explain the program order guarantees. > > But I'm not sure who or what would simply a full smp_mb() when you call > schedule() -- I mean, its true on x86,

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Benjamin Herrenschmidt
On Tue, 2016-08-30 at 15:04 +0200, Oleg Nesterov wrote: > > Confused... how this connects to UNLOCK+LOCK on rq->lock? A LOAD can > leak into the critical section. > > But context switch should imply mb() we can rely on? Between setting of ->on_rq and returning to the task so it can change its

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Benjamin Herrenschmidt
On Tue, 2016-08-30 at 15:04 +0200, Oleg Nesterov wrote: > > Confused... how this connects to UNLOCK+LOCK on rq->lock? A LOAD can > leak into the critical section. > > But context switch should imply mb() we can rely on? Between setting of ->on_rq and returning to the task so it can change its

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Peter Zijlstra
On Tue, Aug 30, 2016 at 06:57:47PM +0200, Oleg Nesterov wrote: > On 08/30, Peter Zijlstra wrote: > > On Tue, Aug 30, 2016 at 03:04:27PM +0200, Oleg Nesterov wrote: > > > But context switch should imply mb() we can rely on? > > > > Not sure it should, on x86 switch_mm does a CR3 write and that is

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Peter Zijlstra
On Tue, Aug 30, 2016 at 06:57:47PM +0200, Oleg Nesterov wrote: > On 08/30, Peter Zijlstra wrote: > > On Tue, Aug 30, 2016 at 03:04:27PM +0200, Oleg Nesterov wrote: > > > But context switch should imply mb() we can rely on? > > > > Not sure it should, on x86 switch_mm does a CR3 write and that is

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Oleg Nesterov
On 08/30, Peter Zijlstra wrote: > On Tue, Aug 30, 2016 at 03:04:27PM +0200, Oleg Nesterov wrote: > > On 08/30, Peter Zijlstra wrote: > > > > > > /* > > >* Ensure we load p->on_rq _after_ p->state, otherwise it would > > >* be possible to, falsely, observe p->on_rq == 0 and get stuck > >

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Oleg Nesterov
On 08/30, Peter Zijlstra wrote: > On Tue, Aug 30, 2016 at 03:04:27PM +0200, Oleg Nesterov wrote: > > On 08/30, Peter Zijlstra wrote: > > > > > > /* > > >* Ensure we load p->on_rq _after_ p->state, otherwise it would > > >* be possible to, falsely, observe p->on_rq == 0 and get stuck > >

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Peter Zijlstra
On Tue, Aug 30, 2016 at 03:04:27PM +0200, Oleg Nesterov wrote: > On 08/30, Peter Zijlstra wrote: > > > > /* > > * Ensure we load p->on_rq _after_ p->state, otherwise it would > > * be possible to, falsely, observe p->on_rq == 0 and get stuck > > * in smp_cond_load_acquire()

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Peter Zijlstra
On Tue, Aug 30, 2016 at 03:04:27PM +0200, Oleg Nesterov wrote: > On 08/30, Peter Zijlstra wrote: > > > > /* > > * Ensure we load p->on_rq _after_ p->state, otherwise it would > > * be possible to, falsely, observe p->on_rq == 0 and get stuck > > * in smp_cond_load_acquire()

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Oleg Nesterov
On 08/30, Peter Zijlstra wrote: > > /* >* Ensure we load p->on_rq _after_ p->state, otherwise it would >* be possible to, falsely, observe p->on_rq == 0 and get stuck >* in smp_cond_load_acquire() below. >* >* sched_ttwu_pending()

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Oleg Nesterov
On 08/30, Peter Zijlstra wrote: > > /* >* Ensure we load p->on_rq _after_ p->state, otherwise it would >* be possible to, falsely, observe p->on_rq == 0 and get stuck >* in smp_cond_load_acquire() below. >* >* sched_ttwu_pending()

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Oleg Nesterov
On 08/30, Balbir Singh wrote: > > The origin of the issue I've seen seems to be related to > rwsem spin lock stealing. Basically I see the system deadlock'd in the > following state > > I have a system with multiple threads and > > Most of the threads are stuck doing > > [67272.593915] ---

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Oleg Nesterov
On 08/30, Balbir Singh wrote: > > The origin of the issue I've seen seems to be related to > rwsem spin lock stealing. Basically I see the system deadlock'd in the > following state > > I have a system with multiple threads and > > Most of the threads are stuck doing > > [67272.593915] ---

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Peter Zijlstra
On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: > > > The origin of the issue I've seen seems to be related to > rwsem spin lock stealing. Basically I see the system deadlock'd in the > following state As Nick says (good to see you're back Nick!), this is unrelated to rwsems.

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Peter Zijlstra
On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote: > > > The origin of the issue I've seen seems to be related to > rwsem spin lock stealing. Basically I see the system deadlock'd in the > following state As Nick says (good to see you're back Nick!), this is unrelated to rwsems.

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Nicholas Piggin
On Tue, 30 Aug 2016 18:49:37 +1000 Balbir Singh wrote: > The origin of the issue I've seen seems to be related to > rwsem spin lock stealing. Basically I see the system deadlock'd in the > following state BTW. this is not really to do with rwsems, but purely a scheduler

Re: [RFC][PATCH] Fix a race between rwsem and the scheduler

2016-08-30 Thread Nicholas Piggin
On Tue, 30 Aug 2016 18:49:37 +1000 Balbir Singh wrote: > The origin of the issue I've seen seems to be related to > rwsem spin lock stealing. Basically I see the system deadlock'd in the > following state BTW. this is not really to do with rwsems, but purely a scheduler bug. It was seen in