On Wed, Aug 31, 2016 at 01:41:33PM +1000, Balbir Singh wrote:
> On 30/08/16 22:19, Peter Zijlstra wrote:
> > On Tue, Aug 30, 2016 at 06:49:37PM +1000, Balbir Singh wrote:
> >>
> >>
> >> The origin of the issue I've seen seems to be related to
> >> rwsem spin lock stealing. Basically I see the system deadlock'd in the
> >> following state
> > 
> > As Nick says (good to see you're back Nick!), this is unrelated to
> > rwsems.
> > 
> > This is true for pretty much every blocking wait loop out there, they
> > all do:
> > 
> >     for (;;) {
> >             current->state = UNINTERRUPTIBLE;
> >             smp_mb();
> >             if (cond)
> >                     break;
> >             schedule();
> >     }
> >     current->state = RUNNING;
> > 
> > Which, if the wakeup is spurious, is just the pattern you need.
> 
> Yes, true! My bad; Alexey had seen the same basic pattern. I should have been
> clearer in my commit log. Should I resend the patch?

Yes please.
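
For the changelog, a single-threaded userspace model of that loop may help
show why a spurious wakeup is harmless: the task just re-marks itself as
sleeping, re-checks the condition and sleeps again. This is only a sketch.
Every name in it (model_task, model_schedule(), model_wait(),
wait_with_spurious_wakeups()) is hypothetical and not kernel API, the
wakeups are scripted rather than concurrent, and the smp_mb() is left out
because a single thread needs no barrier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EVENTS 16

/* Hypothetical stand-ins for the kernel's task states. */
enum model_state { MODEL_RUNNING, MODEL_UNINTERRUPTIBLE };

struct model_task {
	enum model_state state;
	bool cond;                    /* the condition the task waits for */
	bool real_wakeup[MAX_EVENTS]; /* true: this wakeup also sets cond */
	size_t nr_events;
	size_t next_event;
	int schedule_calls;
};

/*
 * Model of schedule(): "sleep" until the next scripted wakeup.
 * Returns false if no wakeup is left, i.e. the task would sleep forever.
 */
static bool model_schedule(struct model_task *t)
{
	if (t->next_event >= t->nr_events)
		return false;
	t->schedule_calls++;
	if (t->real_wakeup[t->next_event++])
		t->cond = true;
	t->state = MODEL_RUNNING;     /* any wakeup makes the task runnable */
	return true;
}

/* The wait loop from the thread, minus smp_mb() (single-threaded model). */
static int model_wait(struct model_task *t)
{
	for (;;) {
		t->state = MODEL_UNINTERRUPTIBLE;
		if (t->cond)
			break;
		if (!model_schedule(t))
			return -1;    /* woken never again: would hang */
	}
	t->state = MODEL_RUNNING;
	return t->schedule_calls;
}

/* Deliver nr_spurious spurious wakeups, then one real one. */
int wait_with_spurious_wakeups(unsigned int nr_spurious)
{
	struct model_task t = { .state = MODEL_RUNNING };

	assert(nr_spurious < MAX_EVENTS);
	t.nr_events = nr_spurious + 1;
	t.real_wakeup[nr_spurious] = true;
	return model_wait(&t);
}
```

With three spurious wakeups, wait_with_spurious_wakeups(3) passes through
the model's schedule() four times and still leaves the loop only once the
condition is true.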

> > There isn't an MB there. The best I can do is UNLOCK+LOCK, which, thanks
> > to PPC, is _not_ MB. It is however sufficient for this case.
> > 
> 
> The MB comes from the __switch_to() in schedule(). Ben mentioned it in a 
> different thread.

Right, although even without that, there is sufficient ordering, as the
rq unlock from the wakeup, coupled with the rq lock from the schedule
already form a load-store barrier.

> > Now, this has been present for a fair while, I suspect ever since we
> > reworked the wakeup path to not use rq->lock twice. Curious you only now
> > hit it.
> > 
> 
> Yes, I just hit it a week or two back and I needed to collect data to
> explain why p->on_rq got to 0. Hitting it requires extreme stress -- for me
> I needed a system with large threads and less memory running stress-ng.
> Reproducing the problem takes an unpredictable amount of time.

What hardware do you see this on? Is it shiny new Power8 chips which
have never before seen deep queues or something? Or is it 'regular' old
Power7-like stuff?
