On Mon, Apr 10, 2017 at 01:48:13PM -0400, Alan Stern wrote:
> On Mon, 10 Apr 2017, Paul E. McKenney wrote:
>
> > On Mon, Apr 10, 2017 at 12:20:53PM -0400, Alan Stern wrote:
> > > On Mon, 10 Apr 2017, Paul E. McKenney wrote:
> > >
> > > > > But I would like to get this matter settled first. Is the explicit
> > > > > barrier truly necessary?
> > > >
> > > > If you are using wait_event()/wake_up() or friends, the explicit
> > > > barrier -is- necessary. To see this, look at v4.10's wait_event():
> > > >
> > > > #define wait_event(wq, condition)
> > > > \
> > > > do {
> > > > \
> > > > might_sleep();
> > > > \
> > > > if (condition)
> > > > \
> > > > break;
> > > > \
> > > > __wait_event(wq, condition);
> > > > \
> > > > } while (0)
> > > >
> > > > As you can see, if the condition is set just before the wait_event()
> > > > macro checks it, there is no ordering whatsoever.
> > >
> > > This is true, but it is not relevant to the question I was asking.
> >
> > Apologies! What I get for answering email too early on Monday, I guess...
> >
> > > > And if wake_up()
> > > > finds nothing to wake up, there is no relevant ordering on that side,
> > > > either.
> > > >
> > > > So you had better supply your own ordering, period, end of story.
> > >
> > > The question is: Exactly what ordering do I need to supply? The
> > > ordering among my own variables is okay; I know how to deal with that.
> > > But what about the ordering between my variables and current->state?
> >
> > The ordering with current->state is sadly not relevant because it is
> > only touched if wake_up() actually wakes the process up.
>
> Well, it is _written_ only if wake_up() actually wakes the process up.
> But it is _read_ in every case.
For wake_up_process(), agreed. But for wake_up(), if the process
doing the wait_event() saw the changed state on the very first check,
the waking process won't have any way of gaining a reference to the
"awakened" task, so cannot access its ->state.
> > > For example, __wait_event() calls prepare_to_wait(), which calls
> > > set_current_state(), which calls smp_store_mb(), thereby inserting a
> > > full memory barrier between setting current->state and checking the
> > > condition. But I didn't see any comparable barrier inserted by
> > > wake_up(), between setting the condition and checking task->state.
> > >
> > > However, now that I look more closely, I do see that wakeup_process()
> > > calls try_to_wake_up(), which begins with:
> > >
> > > /*
> > > * If we are going to wake up a thread waiting for CONDITION we
> > > * need to ensure that CONDITION=1 done by the caller can not be
> > > * reordered with p->state check below. This pairs with mb() in
> > > * set_current_state() the waiting thread does.
> > > */
> > > smp_mb__before_spinlock();
> > > raw_spin_lock_irqsave(&p->pi_lock, flags);
> > > if (!(p->state & state))
> > >
> > > So it does insert a full barrier after all, and there is nothing to
> > > worry about.
> >
> > Nice!
>
> It looks like the other wakeup pathways end up funnelling through
> try_to_wake_up(), so this is true in general.
Only for wake_up_process() and friends, not for wake_up() and friends.
Again, although wake_up_process() unconditionally checks the awakened
processm, wake_up() doesn't even have any way of knowing what process
it woke up in the case where the "awakened" process didn't actually sleep.
But even in the wake_up_process() case, don't we need the wait-side
process to have appropriate barriers (or locks or whatever) manually
inserted?
> > Hmmm...
> >
> > Another valid (and I believe more common) idiom is this:
> >
> > spin_lock(&mylock);
> > changes_that_must_be_visible_to_woken_thread();
> > WRITE_ONCE(need_wake_up, true);
> > spin_unlock(&mylock);
> >
> > ---
> >
> > wait_event(wq, READ_ONCE(need_wake_up));
> > spin_lock(&mylock);
> > access_variables_used_by_waking_thread();
> > spin_unlock(&mylock);
> >
> > In this case, the locks do all the required ordering.
> >
> > > This also means that the analysis provided by Thinh Nguyen in the
> > > original patch description is wrong.
> >
> > And that the bug is elsewhere?
>
> Presumably. On the other hand, Thinh Nguyen claimed to have narrowed
> the problem down to this particular mechanism. The driver in question
> in drivers/usb/gadget/function/f_mass_storage.c. The waker routines
> are bulk_out_complete()/wakeup_thread(), which do:
>
> // bulk_out_complete()
> bh->state = BH_STATE_FULL;
>
> // wakeup_thread()
> smp_wmb(); /* ensure the write of bh->state is complete */
> /* Tell the main thread that something has happened */
> common->thread_wakeup_needed = 1;
> if (common->thread_task)
> wake_up_process(common->thread_task);
>
> and the waiters are get_next_command()/sleep_thread(), which do:
>
> // get_next_command()
> while (bh->state == BH_STATE_EMPTY) {
>
> // sleep_thread()
> for (;;) {
> if (can_freeze)
> try_to_freeze();
> set_current_state(TASK_INTERRUPTIBLE);
> if (signal_pending(current)) {
> rc = -EINTR;
> break;
> }
> if (common->thread_wakeup_needed)
> break;
> schedule();
> }
> __set_current_state(TASK_RUNNING);
> common->thread_wakeup_needed = 0;
> smp_rmb(); /* ensure the latest bh->state is visible */
>
> }
>
> and he said that the problem was caused by the waiter seeing
> thread_wakeup_needed == 0, so the wakeup was getting lost.
>
> Hmmm, I suppose it's possible that the waker's thread_wakeup_needed = 1
> could race with the waiter's thread_wakeup_needed = 0. If there are
> two waits in quick succession, the second could get lost. The pattern
> would be:
>
> bh->state = BH_STATE_FULL;
> smp_wmb();
> thread_wakeup_needed = 0; thread_wakeup_needed = 1;
> smp_rmb();
> if (bh->state != BH_STATE_FULL)
> sleep again...
>
> This is the so-called R pattern, and it also needs full memory barriers
> on both sides. The barriers we have are not sufficient. (This is an
> indication that the driver's design needs to be re-thought.) As it is,
> the waiter's thread_wakeup_needed = 0 can overwrite the waker's
> thread_wakeup_needed = 1 while the waiter's read of bh->state then
> fails to see the waker's write. (This analysis is similar to but
> different from the one in the patch description.)
>
> To fix this problem, both the smp_rmb() in sleep_thread() and the
> smp_wmb() in wakeup_thread() should be changed to smp_mb().
Good catch!
However, if this failure was seen on x86, there is something else going
on as well.
Thanx, Paul
> Felipe, was this patch meant to solve the problem you encountered in
> your "Memory barrier needed with wake_up_process()?" email thread last
> fall?
>
> Alan Stern
>
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html