Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Paul E. McKenney
On Mon, Apr 10, 2017 at 03:00:34PM -0400, Alan Stern wrote:
> On Mon, 10 Apr 2017, Paul E. McKenney wrote:
> 
> > > > The ordering with current->state is sadly not relevant because it is
> > > > only touched if wake_up() actually wakes the process up.
> > > 
> > > Well, it is _written_ only if wake_up() actually wakes the process up.  
> > > But it is _read_ in every case.
> > 
> > For wake_up_process(), agreed.  But for wake_up(), if the process
> > doing the wait_event() saw the changed state on the very first check,
> > the waking process won't have any way of gaining a reference to the
> > "awakened" task, so cannot access its ->state.
> 
> True.  But that would be okay, since the waiter has definitely seen the
> changed state.  I was concerned about the possibility that there was no
> wakeup and the waiter did _not_ see the changed state.  That's how you 
> get tasks staying asleep indefinitely when they should be running.
> 
> > > It looks like the other wakeup pathways end up funnelling through 
> > > try_to_wake_up(), so this is true in general.
> > 
> > Only for wake_up_process() and friends, not for wake_up() and friends.
> > Again, although wake_up_process() unconditionally checks the awakened
> > process, wake_up() doesn't even have any way of knowing what process
> > it woke up in the case where the "awakened" process didn't actually sleep.
> 
> Like the above, this would be okay.
> 
> > But even in the wake_up_process() case, don't we need the wait-side
> > process to have appropriate barriers (or locks or whatever) manually
> > inserted?
> 
> Only for accesses among the driver's own variables.  There's no need to
> order the local variables against current->state; as we have seen,
> that's all handled for us.

Fair enough.  However, the kernel documentation needs to cover the
ordering of the driver's own variables as well.

> > > > > This also means that the analysis provided by Thinh Nguyen in the 
> > > > > original patch description is wrong.
> > > > 
> > > > And that the bug is elsewhere?
> > > 
> > > Presumably.  On the other hand, Thinh Nguyen claimed to have narrowed
> > > the problem down to this particular mechanism.  The driver in question
> > > is drivers/usb/gadget/function/f_mass_storage.c.  The waker routines
> > > are bulk_out_complete()/wakeup_thread(), which do:
> > > 
> > >   // bulk_out_complete()
> > >   bh->state = BUF_STATE_FULL;
> > > 
> > >   // wakeup_thread()
> > >   smp_wmb();  /* ensure the write of bh->state is complete */
> > >   /* Tell the main thread that something has happened */
> > >   common->thread_wakeup_needed = 1;
> > >   if (common->thread_task)
> > >           wake_up_process(common->thread_task);
> > > 
> > > and the waiters are get_next_command()/sleep_thread(), which do:
> > > 
> > > // get_next_command()
> > > while (bh->state == BUF_STATE_EMPTY) {
> > > 
> > >   // sleep_thread()
> > >   for (;;) {
> > >           if (can_freeze)
> > >                   try_to_freeze();
> > >           set_current_state(TASK_INTERRUPTIBLE);
> > >           if (signal_pending(current)) {
> > >                   rc = -EINTR;
> > >                   break;
> > >           }
> > >           if (common->thread_wakeup_needed)
> > >                   break;
> > >           schedule();
> > >   }
> > >   __set_current_state(TASK_RUNNING);
> > >   common->thread_wakeup_needed = 0;
> > >   smp_rmb();  /* ensure the latest bh->state is visible */
> > > 
> > > }
> > > 
> > > and he said that the problem was caused by the waiter seeing 
> > > thread_wakeup_needed == 0, so the wakeup was getting lost.
> > > 
> > > Hmmm, I suppose it's possible that the waker's thread_wakeup_needed = 1
> > > could race with the waiter's thread_wakeup_needed = 0.  If there are
> > > two waits in quick succession, the second could get lost.  The pattern
> > > would be:
> > > 
> > >                                   bh->state = BUF_STATE_FULL;
> > >                                   smp_wmb();
> > >   thread_wakeup_needed = 0;       thread_wakeup_needed = 1;
> > >   smp_rmb();
> > >   if (bh->state != BUF_STATE_FULL)
> > >           sleep again...
> > > 
> >
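
For reference, the fix proposed in the patch under discussion closes this
window with a full barrier after the clear of thread_wakeup_needed in
sleep_thread().  A rough sketch of the resulting tail of sleep_thread()
(not a verbatim copy of the patch):

        __set_current_state(TASK_RUNNING);
        common->thread_wakeup_needed = 0;
        smp_mb();       /* full barrier instead of (or in addition to) the
                         * smp_rmb(), so the clear of thread_wakeup_needed
                         * cannot be reordered with the later re-read of
                         * bh->state */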

Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Paul E. McKenney
On Mon, Apr 10, 2017 at 01:48:13PM -0400, Alan Stern wrote:
> On Mon, 10 Apr 2017, Paul E. McKenney wrote:
> 
> > On Mon, Apr 10, 2017 at 12:20:53PM -0400, Alan Stern wrote:
> > > On Mon, 10 Apr 2017, Paul E. McKenney wrote:
> > > 
> > > > > But I would like to get this matter settled first.  Is the explicit 
> > > > > barrier truly necessary?
> > > > 
> > > > If you are using wait_event()/wake_up() or friends, the explicit
> > > > barrier -is- necessary.  To see this, look at v4.10's wait_event():
> > > > 
> > > > #define wait_event(wq, condition)   \
> > > > do {\
> > > > might_sleep();  \
> > > > if (condition)  \
> > > > break;  \
> > > > __wait_event(wq, condition);\
> > > > } while (0)
> > > > 
> > > > As you can see, if the condition is set just before the wait_event()
> > > > macro checks it, there is no ordering whatsoever.
> > > 
> > > This is true, but it is not relevant to the question I was asking.
> > 
> > Apologies!  What I get for answering email too early on Monday, I guess...
> > 
> > > >  And if wake_up()
> > > > finds nothing to wake up, there is no relevant ordering on that side,
> > > > either.
> > > > 
> > > > So you had better supply your own ordering, period, end of story.
> > > 
> > > The question is: Exactly what ordering do I need to supply?  The 
> > > ordering among my own variables is okay; I know how to deal with that.  
> > > But what about the ordering between my variables and current->state?
> > 
> > The ordering with current->state is sadly not relevant because it is
> > only touched if wake_up() actually wakes the process up.
> 
> Well, it is _written_ only if wake_up() actually wakes the process up.  
> But it is _read_ in every case.

For wake_up_process(), agreed.  But for wake_up(), if the process
doing the wait_event() saw the changed state on the very first check,
the waking process won't have any way of gaining a reference to the
"awakened" task, so cannot access its ->state.

> > > For example, __wait_event() calls prepare_to_wait(), which calls
> > > set_current_state(), which calls smp_store_mb(), thereby inserting a
> > > full memory barrier between setting current->state and checking the
> > > condition.  But I didn't see any comparable barrier inserted by
> > > wake_up(), between setting the condition and checking task->state.
> > > 
> > > However, now that I look more closely, I do see that wake_up_process()
> > > calls try_to_wake_up(), which begins with:
> > > 
> > >   /*
> > >* If we are going to wake up a thread waiting for CONDITION we
> > >* need to ensure that CONDITION=1 done by the caller can not be
> > >* reordered with p->state check below. This pairs with mb() in
> > >* set_current_state() the waiting thread does.
> > >*/
> > >   smp_mb__before_spinlock();
> > >   raw_spin_lock_irqsave(&p->pi_lock, flags);
> > >   if (!(p->state & state))
> > > 
> > > So it does insert a full barrier after all, and there is nothing to 
> > > worry about.
> > 
> > Nice!
> 
> It looks like the other wakeup pathways end up funnelling through 
> try_to_wake_up(), so this is true in general.

Only for wake_up_process() and friends, not for wake_up() and friends.
Again, although wake_up_process() unconditionally checks the awakened
process, wake_up() doesn't even have any way of knowing what process
it woke up in the case where the "awakened" process didn't actually sleep.

But even in the wake_up_process() case, don't we need the wait-side
process to have appropriate barriers (or locks or whatever) manually
inserted?

> > Hmmm...
> > 
> > Another valid (and I believe more common) idiom is this:
> > 
> > spin_lock(&mylock);
> > changes_that_must_be_visible_to_woken_thread();
> > WRITE_ONCE(need_wake_up, true);
> > spin_unlock(&mylock);
> > 
> >   

Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Paul E. McKenney
On Mon, Apr 10, 2017 at 12:20:53PM -0400, Alan Stern wrote:
> On Mon, 10 Apr 2017, Paul E. McKenney wrote:
> 
> > > But I would like to get this matter settled first.  Is the explicit 
> > > barrier truly necessary?
> > 
> > If you are using wait_event()/wake_up() or friends, the explicit
> > barrier -is- necessary.  To see this, look at v4.10's wait_event():
> > 
> > #define wait_event(wq, condition)   \
> > do {\
> > might_sleep();  \
> > if (condition)  \
> > break;  \
> > __wait_event(wq, condition);\
> > } while (0)
> > 
> > As you can see, if the condition is set just before the wait_event()
> > macro checks it, there is no ordering whatsoever.
> 
> This is true, but it is not relevant to the question I was asking.

Apologies!  What I get for answering email too early on Monday, I guess...

> >  And if wake_up()
> > finds nothing to wake up, there is no relevant ordering on that side,
> > either.
> > 
> > So you had better supply your own ordering, period, end of story.
> 
> The question is: Exactly what ordering do I need to supply?  The 
> ordering among my own variables is okay; I know how to deal with that.  
> But what about the ordering between my variables and current->state?

The ordering with current->state is sadly not relevant because it is
only touched if wake_up() actually wakes the process up.

> For example, __wait_event() calls prepare_to_wait(), which calls
> set_current_state(), which calls smp_store_mb(), thereby inserting a
> full memory barrier between setting current->state and checking the
> condition.  But I didn't see any comparable barrier inserted by
> wake_up(), between setting the condition and checking task->state.
> 
> However, now that I look more closely, I do see that wake_up_process()
> calls try_to_wake_up(), which begins with:
> 
>   /*
>* If we are going to wake up a thread waiting for CONDITION we
>* need to ensure that CONDITION=1 done by the caller can not be
>* reordered with p->state check below. This pairs with mb() in
>* set_current_state() the waiting thread does.
>*/
>   smp_mb__before_spinlock();
>   raw_spin_lock_irqsave(&p->pi_lock, flags);
>   if (!(p->state & state))
> 
> So it does insert a full barrier after all, and there is nothing to 
> worry about.

Nice!

Hmmm...

Another valid (and I believe more common) idiom is this:

spin_lock(&mylock);
changes_that_must_be_visible_to_woken_thread();
WRITE_ONCE(need_wake_up, true);
spin_unlock(&mylock);

---

wait_event(wq, READ_ONCE(need_wake_up));
spin_lock(&mylock);
access_variables_used_by_waking_thread();
spin_unlock(&mylock);

In this case, the locks do all the required ordering.
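
Filling in the declarations, a minimal self-contained sketch of that
lock-based idiom might look as follows (mylock, mywq, need_wake_up, and
shared_data are made-up names, purely for illustration):

        static DEFINE_SPINLOCK(mylock);
        static DECLARE_WAIT_QUEUE_HEAD(mywq);
        static bool need_wake_up;
        static int shared_data;         /* stand-in for the driver's real state */

        /* Waking side */
        spin_lock(&mylock);
        shared_data = 42;               /* must be visible to the woken thread */
        WRITE_ONCE(need_wake_up, true);
        spin_unlock(&mylock);
        wake_up(&mywq);

        /* Waiting side */
        wait_event(mywq, READ_ONCE(need_wake_up));
        spin_lock(&mylock);
        pr_info("saw %d\n", shared_data);       /* sees the waker's stores */
        spin_unlock(&mylock);

Here it is the unlock/lock pair on mylock, not the wait/wakeup machinery,
that guarantees the waiter sees the waker's prior stores.
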

> This also means that the analysis provided by Thinh Nguyen in the 
> original patch description is wrong.

And that the bug is elsewhere?

Thanx, Paul



Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Paul E. McKenney
On Mon, Apr 10, 2017 at 11:01:59AM -0400, Alan Stern wrote:
> On Mon, 10 Apr 2017, Felipe Balbi wrote:
> 
> > Hi,
> > 
> > Thinh Nguyen  writes:
> > > f_mass_storage has a memory barrier issue with the sleep and wake
> > > functions that can cause a deadlock. This results in intermittent hangs
> > > during MSC file transfer. The host will reset the device after receiving
> > > no response to resume the transfer. This issue is seen when dwc3 is
> > > processing 2 transfer-in-progress events at the same time, invoking
> > > completion handlers for CSW and CBW. The issue also depends on system
> > > timing and latency.
> > >
> > > To increase the chance of hitting this issue, you can force the dwc3
> > > driver to wait and process those 2 events at once by adding a small delay
> > > (~100us) in dwc3_check_event_buf() whenever the request is for CSW, and
> > > then read the event count again. Avoid debugging with printk and ftrace,
> > > as the extra delays and memory barriers will mask this issue.
> 
> > >
> > > Scenario which can lead to failure:
> > > ---
> > > 1) The main thread sleeps and waits for the next command.
> > > 2) bulk_in_complete() wakes up main thread for CSW.
> > > 3) bulk_out_complete() tries to wake up the running main thread for CBW.
> > > 4) thread_wakeup_needed is not loaded with the correct value in
> > > sleep_thread().
> > > 5) Main thread goes to sleep.
> > >
> > > See more detail below:
> > >
> > > Note the 2 critical variables.
> > >  * common->thread_wakeup_needed
> > >  * bh->state
> > >
> > > Eval'd  | CPU0 (main thread)                           | CPU1
> > > --------+----------------------------------------------+------------------------------
> > >         | get_next_command()                           |
> > >         | ...                                          |
> > > while(1)| while (bh->state != BUF_STATE_FULL) {        |
> > >         |   for (;;) {                                 |
> > >         |     set_current_state(TASK_INTERRUPTIBLE);   |
> > > if(0)   |     if (thread_wakeup_needed)                |
> > >         |     schedule()                               |
> > >         |                                              | bulk_in_complete() {
> > >         |                                              |   smp_wmb();
> > >         |                                              |   bh->state = BUF_STATE_EMPTY
> > >         |                                              |   smp_wmb();
> > >         |                                              |   thread_wakeup_needed = 1
> > >         |                                              |   wake_up_process()
> > >         |                                              | }
> > >         |   /* 2nd for loop iteration */               |
> > >         |   for (;;) {                                 |
> > >         |     set_current_state(TASK_INTERRUPTIBLE);   |
> > > if(1)   |     if (thread_wakeup_needed)                |
> > >         |       break                                  |
> > >         |   }                                          |
> > >         |                                              | bulk_out_complete() {
> > >         |                                              |   smp_wmb();
> > >         |   __set_current_state(TASK_RUNNING);         |
> > >         |   thread_wakeup_needed = 0;                  |
> > >         |   smp_rmb();                                 |
> > >         |                                              |
> > >         |   /* 2nd while loop iteration */             |
> > > while(1)| while (bh->state != BUF_STATE_FULL) {        |
> > >         |                                              |   bh->state = BUF_STATE_FULL;
> > >         |                                              |   smp_wmb();
> > >         |                                           -->|   thread_wakeup_needed = 1
> > >         |                                              |   wake_up_process()
> > >         |                                              | }
> > >         |   for (;;) {                                 |
> > >         |     set_current_state(TASK_INTERRUPTIBLE);   |
> > > if(0)   |     if (thread_wakeup_needed)                | <--
> > >         |     schedule()                               |
> > >         |                                              |
> > >         |                                              |
> > >
> > > thread_wakeup_needed is not guaranteed to be loaded before checking it.
> > > Note that this happens when wake_up_process() is called on a running
> > > process.
> > >
> > > This patch fixes the issue by explicitly using smp_mb() after the clearing
> > > of thread_wakeup_needed in the sleep function. It serializes the order in
> > > which thread_wakeup_needed is stored and loaded, thus preventing the
> > > sleeper from loading a stale value when checking it.
> > >
> > > However, a better solution in the future would be to use the wait_queue
> > > method, which takes care of managing the memory barriers between waker
> > > and waiter.
> > >
> > > See the DEFINE_WAIT_FUNC comment in the scheduler's wait.c; this
> > > solution is similar to its implementation.
> > >
> > > Signed-off-by: Thinh Nguyen 
> > > ---
> > >  drivers/usb/gadget/function/f_mass_storage.c | 2 +-
> > >  1 file c
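
The "wait_queue method" mentioned in the description above would look
roughly like the following sketch (the io_wait field and the exact call
sites are illustrative only, not the eventual in-tree change):

        /* In struct fsg_common (illustrative): */
        wait_queue_head_t       io_wait;

        /* Completion/waker side, e.g. from bulk_out_complete(): */
        bh->state = BUF_STATE_FULL;
        wake_up(&common->io_wait);      /* sleep/wake ordering handled by the core */

        /* Sleeper side, e.g. in get_next_command(): */
        rc = wait_event_interruptible(common->io_wait,
                                      bh->state == BUF_STATE_FULL);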

Re: Memory barrier needed with wake_up_process()?

2016-09-03 Thread Paul E. McKenney
On Fri, Sep 02, 2016 at 04:29:19PM -0400, Alan Stern wrote:
> On Fri, 2 Sep 2016, Paul E. McKenney wrote:
> 
> > On Fri, Sep 02, 2016 at 02:10:13PM -0400, Alan Stern wrote:
> > > Paul, Peter, and Ingo:
> > > 
> > > This must have come up before, but I don't know what was decided.
> > > 
> > > Isn't it often true that a memory barrier is needed before a call to 
> > > wake_up_process()?  A typical scenario might look like this:
> > > 
> > >   CPU 0
> > >   -
> > >   for (;;) {
> > >           set_current_state(TASK_INTERRUPTIBLE);
> > >           if (signal_pending(current))
> > >                   break;
> > >           if (wakeup_flag)
> > >                   break;
> > >           schedule();
> > >   }
> > >   __set_current_state(TASK_RUNNING);
> > >   wakeup_flag = 0;
> > > 
> > > 
> > >   CPU 1
> > >   -
> > >   wakeup_flag = 1;
> > >   wake_up_process(my_task);
> > > 
> > > The underlying pattern is:
> > > 
> > >   CPU 0                   CPU 1
> > >   -----                   -----
> > >   write current->state    write wakeup_flag
> > >   smp_mb();
> > >   read wakeup_flag        read my_task->state
> > > 
> > > where set_current_state() does the write to current->state and 
> > > automatically adds the smp_mb(), and wake_up_process() reads 
> > > my_task->state to see whether the task needs to be woken up.
> > > 
> > > The kerneldoc for wake_up_process() says that it has no implied memory
> > > barrier if it doesn't actually wake anything up.  And even when it
> > > does, the implied barrier is only smp_wmb, not smp_mb.
> > > 
> > > This is the so-called SB (Store Buffer) pattern, which is well known to
> > > require a full smp_mb on both sides.  Since wake_up_process() doesn't
> > > include smp_mb(), isn't it correct that the caller must add it
> > > explicitly?
> > > 
> > > In other words, shouldn't the code for CPU 1 really be:
> > > 
> > >   wakeup_flag = 1;
> > >   smp_mb();
> > >   wake_up_process(task);
> > > 
> > > If my reasoning is correct, then why doesn't wake_up_process() include 
> > > this memory barrier automatically, the way set_current_state() does?  
> > > There could be an alternate version (__wake_up_process()) which omits 
> > > the barrier, just like __set_current_state().
> > 
> > A common case uses locking, in which case additional memory barriers
> > inside of the wait/wakeup functions are not needed.  Any accesses made
> > while holding the lock before invoking the wakeup function (e.g.,
> > wake_up()) are guaranteed to be seen after acquiring that same
> > lock following return from the wait function (e.g., wait_event()).
> > In this case, adding barriers to the wait and wakeup functions would
> > just add overhead.
> > 
> > But yes, this decision does mean that people using the wait/wakeup
> > functions without locking need to be more careful.  Something like
> > this:
> > 
> > /* prior accesses. */
> > smp_mb();
> > wakeup_flag = 1;
> > wake_up(...);
> > 
> > And on the other task:
> > 
> > wait_event(... wakeup_flag == 1 ...);
> > smp_mb();
> > /* The waker's prior accesses will be visible here. */
> > 
> > Or am I missing your point?
> 
> I'm afraid so.  The code doesn't use wait_event(), in part because
> there's no wait_queue (since only one task is involved).

Ah, got it.

The required pattern should be very similar, however.

> But maybe there's another barrier which needs to be fixed.  Felipe, can
> you check to see if received_cbw() is getting called in
> get_next_command(), and if so, what value it returns?  Or is the
> preceding sleep_thread() the one that never wakes up?
> 
> It could be that the smp_wmb() in wakeup_thread() needs to be smp_mb().  
> The reason being that get_next_command() runs outside the protection of 
> the spinlock.

This sounds very likely to me.
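
Concretely, that would amount to something like the following in
wakeup_thread() (a sketch only, based on the version of the function
quoted earlier in this archive):

        static void wakeup_thread(struct fsg_common *common)
        {
                /*
                 * Was smp_wmb(); a full barrier here because
                 * get_next_command() runs outside the protection
                 * of the spinlock.
                 */
                smp_mb();
                /* Tell the main thread that something has happened */
                common->thread_wakeup_needed = 1;
                if (common->thread_task)
                        wake_up_process(common->thread_task);
        }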

Thanx, Paul



Re: Memory barrier needed with wake_up_process()?

2016-09-02 Thread Paul E. McKenney
On Fri, Sep 02, 2016 at 02:10:13PM -0400, Alan Stern wrote:
> Paul, Peter, and Ingo:
> 
> This must have come up before, but I don't know what was decided.
> 
> Isn't it often true that a memory barrier is needed before a call to 
> wake_up_process()?  A typical scenario might look like this:
> 
>   CPU 0
>   -
>   for (;;) {
>           set_current_state(TASK_INTERRUPTIBLE);
>           if (signal_pending(current))
>                   break;
>           if (wakeup_flag)
>                   break;
>           schedule();
>   }
>   __set_current_state(TASK_RUNNING);
>   wakeup_flag = 0;
> 
> 
>   CPU 1
>   -
>   wakeup_flag = 1;
>   wake_up_process(my_task);
> 
> The underlying pattern is:
> 
>   CPU 0                   CPU 1
>   -----                   -----
>   write current->state    write wakeup_flag
>   smp_mb();
>   read wakeup_flag        read my_task->state
> 
> where set_current_state() does the write to current->state and 
> automatically adds the smp_mb(), and wake_up_process() reads 
> my_task->state to see whether the task needs to be woken up.
> 
> The kerneldoc for wake_up_process() says that it has no implied memory
> barrier if it doesn't actually wake anything up.  And even when it
> does, the implied barrier is only smp_wmb, not smp_mb.
> 
> This is the so-called SB (Store Buffer) pattern, which is well known to
> require a full smp_mb on both sides.  Since wake_up_process() doesn't
> include smp_mb(), isn't it correct that the caller must add it
> explicitly?
> 
> In other words, shouldn't the code for CPU 1 really be:
> 
>   wakeup_flag = 1;
>   smp_mb();
>   wake_up_process(task);
> 
> If my reasoning is correct, then why doesn't wake_up_process() include 
> this memory barrier automatically, the way set_current_state() does?  
> There could be an alternate version (__wake_up_process()) which omits 
> the barrier, just like __set_current_state().

A common case uses locking, in which case additional memory barriers
inside of the wait/wakeup functions are not needed.  Any accesses made
while holding the lock before invoking the wakeup function (e.g.,
wake_up()) are guaranteed to be seen after acquiring that same
lock following return from the wait function (e.g., wait_event()).
In this case, adding barriers to the wait and wakeup functions would
just add overhead.

But yes, this decision does mean that people using the wait/wakeup
functions without locking need to be more careful.  Something like
this:

/* prior accesses. */
smp_mb();
wakeup_flag = 1;
wake_up(...);

And on the other task:

wait_event(... wakeup_flag == 1 ...);
smp_mb();
/* The waker's prior accesses will be visible here. */

Or am I missing your point?

Thanx, Paul



Re: [RESUBMIT] [PATCH] Replace mentions of "list_struct" to "list_head"

2014-11-14 Thread Paul E. McKenney
On Fri, Nov 14, 2014 at 05:09:55AM +0400, Andrey Utkin wrote:
> There's no such thing as "list_struct".
> 
> Signed-off-by: Andrey Utkin 

May as well get group rates on the acks...  ;-)

Acked-by: Paul E. McKenney 

> ---
>  drivers/gpu/drm/radeon/mkregtable.c  | 24 
>  drivers/media/pci/cx18/cx18-driver.h |  2 +-
>  include/linux/list.h | 34 +-
>  include/linux/plist.h| 10 +-
>  include/linux/rculist.h  |  8 
>  scripts/kconfig/list.h   |  6 +++---
>  tools/usb/usbip/libsrc/list.h|  2 +-
>  7 files changed, 43 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/mkregtable.c 
> b/drivers/gpu/drm/radeon/mkregtable.c
> index 4a85bb6..b928c17 100644
> --- a/drivers/gpu/drm/radeon/mkregtable.c
> +++ b/drivers/gpu/drm/radeon/mkregtable.c
> @@ -347,7 +347,7 @@ static inline void list_splice_tail_init(struct list_head 
> *list,
>   * list_entry - get the struct for this entry
>   * @ptr: the &struct list_head pointer.
>   * @type: the type of the struct this is embedded in.
> - * @member:  the name of the list_struct within the struct.
> + * @member:  the name of the list_head within the struct.
>   */
>  #define list_entry(ptr, type, member) \
>   container_of(ptr, type, member)
> @@ -356,7 +356,7 @@ static inline void list_splice_tail_init(struct list_head 
> *list,
>   * list_first_entry - get the first element from a list
>   * @ptr: the list head to take the element from.
>   * @type: the type of the struct this is embedded in.
> - * @member:  the name of the list_struct within the struct.
> + * @member:  the name of the list_head within the struct.
>   *
>   * Note, that list is expected to be not empty.
>   */
> @@ -406,7 +406,7 @@ static inline void list_splice_tail_init(struct list_head 
> *list,
>   * list_for_each_entry   -   iterate over list of given type
>   * @pos: the type * to use as a loop cursor.
>   * @head: the head for your list.
> - * @member:  the name of the list_struct within the struct.
> + * @member:  the name of the list_head within the struct.
>   */
>  #define list_for_each_entry(pos, head, member)   
> \
>   for (pos = list_entry((head)->next, typeof(*pos), member);  \
> @@ -417,7 +417,7 @@ static inline void list_splice_tail_init(struct list_head 
> *list,
>   * list_for_each_entry_reverse - iterate backwards over list of given type.
>   * @pos: the type * to use as a loop cursor.
>   * @head: the head for your list.
> - * @member:  the name of the list_struct within the struct.
> + * @member:  the name of the list_head within the struct.
>   */
>  #define list_for_each_entry_reverse(pos, head, member)   
> \
>   for (pos = list_entry((head)->prev, typeof(*pos), member);  \
> @@ -428,7 +428,7 @@ static inline void list_splice_tail_init(struct list_head 
> *list,
>   * list_prepare_entry - prepare a pos entry for use in 
> list_for_each_entry_continue()
>   * @pos: the type * to use as a start point
>   * @head: the head of the list
> - * @member:  the name of the list_struct within the struct.
> + * @member:  the name of the list_head within the struct.
>   *
>   * Prepares a pos entry for use as a start point in 
> list_for_each_entry_continue().
>   */
> @@ -439,7 +439,7 @@ static inline void list_splice_tail_init(struct list_head 
> *list,
>   * list_for_each_entry_continue - continue iteration over list of given type
>   * @pos: the type * to use as a loop cursor.
>   * @head: the head for your list.
> - * @member:  the name of the list_struct within the struct.
> + * @member:  the name of the list_head within the struct.
>   *
>   * Continue to iterate over list of given type, continuing after
>   * the current position.
> @@ -453,7 +453,7 @@ static inline void list_splice_tail_init(struct list_head 
> *list,
>   * list_for_each_entry_continue_reverse - iterate backwards from the given 
> point
>   * @pos: the type * to use as a loop cursor.
>   * @head: the head for your list.
> - * @member:  the name of the list_struct within the struct.
> + * @member:  the name of the list_head within the struct.
>   *
>   * Start to iterate over list of given type backwards, continuing after
>   * the current position.
> @@ -467,7 +467,7 @@ static inline void list_splice_tail_init(struct list_head 
> *list,
>   * list_for_each_entry_from - iterate over list of given type from the 
> current point
>   * @pos: the type * to use as a loop cursor.
>   * @head: the head for your list.
> - 

Re: RCU bug with v3.17-rc3 ?

2014-09-05 Thread Paul E. McKenney
On Thu, Sep 04, 2014 at 03:04:03PM -0500, Felipe Balbi wrote:
> Hi,
> 
> On Thu, Sep 04, 2014 at 02:25:35PM -0500, Felipe Balbi wrote:
> > On Thu, Sep 04, 2014 at 12:16:42PM -0700, Paul E. McKenney wrote:
> > > On Thu, Sep 04, 2014 at 01:40:21PM -0500, Felipe Balbi wrote:
> > > > Hi,
> > > > 
> > > > I keep triggering the following Oops with -rc3 when writing to the mass
> > > > storage gadget driver:
> > > 
> > > v3.17-rc3, correct?
> > 
> > yup, as in subject ;-)
> > 
> > > I take it that the test passes on some earlier version?
> > 
> > about to test v3.14.17.
> 
> couldn't get v3.14 working on this board, but at least v3.16 is also
> affected, except that now it happened during boot; I didn't even need
> to run my test:
> 
> [   17.438195] Unable to handle kernel paging request at virtual address 
> 
> [   17.446109] pgd = ec36
> [   17.448947] [] *pgd=ae7f6821, *pte=, *ppte=
> [   17.455639] Internal error: Oops: 17 [#1] SMP ARM
> [   17.460578] Modules linked in: dwc3(+) udc_core lis3lv02d_i2c lis3lv02d 
> input_polldev dwc3_omap matrix_keypad
> [   17.471060] CPU: 0 PID: 1381 Comm: accounts-daemon Tainted: G W 
> 3.16.0-5-g8a6cdb4 #811
> [   17.480735] task: ed716040 ti: ec026000 task.ti: ec026000
> [   17.486405] PC is at find_get_entry+0x7c/0x128
> [   17.491070] LR is at 0xfffa
> [   17.494364] pc : []lr : []psr: a013
> [   17.494364] sp : ec027dc8  ip :   fp : ec027dfc
> [   17.506384] r10: c0c6f6bc  r9 : 0005  r8 : ecdf22f8
> [   17.511860] r7 : ec026008  r6 : 0001  r5 :   r4 : 
> [   17.518705] r3 : ec027db4  r2 :   r1 : 0005  r0 : 
> [   17.525526] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM Segment 
> user
> [   17.533007] Control: 10c5387d  Table: ac360059  DAC: 0015
> [   17.539020] Process accounts-daemon (pid: 1381, stack limit = 0xec026248)
> [   17.546151] Stack: (0xec027dc8 to 0xec028000)
> [   17.550710] 7dc0:     c0110ad0 ecdf0b80 
>  ecdf22f4
> [   17.559259] 7de0: ecdf22f4  0005  ec027e34 ec027e00 
> c0111874 c0110adc
> [   17.567824] 7e00: ecdf0b80 c03565b4 ed7165f8 ec3dddf0 ecdf22f4 0005 
> ec3ddd00 0001
> [   17.576385] 7e20: ecdf21a0  ec027ebc ec027e38 c0112978 c0111844 
>  c06af938
> [   17.584950] 7e40: ecdf0b70 ecdf0b70 ec027e6c ec027e58 0005 0006 
> 0b80 ecdf0b70
> [   17.593514] 7e60:  c0163264 ec3dddf0 ec027ee8 ec027ed4 0b80 
> ec027eac ec027e88
> [   17.602087] 7e80: c0178d98 c0356590   0002 5b80 
>  ec027f78
> [   17.610653] 7ea0: ec3ddd00 ed716040 b6cab018  ec027f44 ec027ec0 
> c0163264 c0112780
> [   17.619202] 7ec0: 0180 0180 ec027efc b6cab018 0180  
>  0180
> [   17.627772] 7ee0: ec027ecc 0001 ec3ddd00    
> ed716040 
> [   17.636371] 7f00:   5b80  0180  
>  
> [   17.644946] 7f20: b6cab018 ec3ddd00 b6cab018 ec027f78 ec3ddd00 0180 
> ec027f74 ec027f48
> [   17.653524] 7f40: c0163a6c c01631cc b6cab018  5b80  
> ec3ddd03 ec3ddd00
> [   17.662085] 7f60: 0180 b6cab018 ec027fa4 ec027f78 c0164198 c01639e0 
> 5b80 
> [   17.670658] 7f80: be91badc be91ba50 00044a00 0003 c000f044 ec026000 
>  ec027fa8
> [   17.679222] 7fa0: c000edc0 c0164158 be91badc be91ba50 0008 b6cab018 
> 0180 be91ba38
> [   17.687794] 7fc0: be91badc be91ba50 00044a00 0003 be91bbac b6cab008 
>  
> [   17.696370] 7fe0: 0020 be91ba40 b6c78e8c b6c78ea8 6010 0008 
> ae7f6821 ae7f6c21
> [   17.704956] [] (find_get_entry) from [] 
> (pagecache_get_page+0x3c/0x1f4)
> [   17.713687] [] (pagecache_get_page) from [] 
> (generic_file_read_iter+0x204/0x794)
> [   17.723259] [] (generic_file_read_iter) from [] 
> (new_sync_read+0xa4/0xcc)
> [   17.732185] [] (new_sync_read) from [] 
> (vfs_read+0x98/0x158)
> [   17.739945] [] (vfs_read) from [] (SyS_read+0x4c/0xa0)
> [   17.747149] [] (SyS_read) from [] 
> (ret_fast_syscall+0x0/0x48)
> [   17.754994] Code: e1a01009 eb08ffa9 e350 0a1f (e5904000) 
> [   17.761476] ---[ end trace 49c4ed35a1c01157 ]---
> 
> It seems to be a difficult-to-reproduce race though. On a second boot it
> didn't die during boot, but died with my USB test case. Unfortunately,
> the platform I'm using is pretty new and only goes as far back as v3.16
> (which I had to backport 11 patches to get it to boot good enough for

Re: RCU bug with v3.17-rc3 ?

2014-09-04 Thread Paul E. McKenney
On Thu, Sep 04, 2014 at 01:40:21PM -0500, Felipe Balbi wrote:
> Hi,
> 
> I keep triggering the following Oops with -rc3 when writing to the mass
> storage gadget driver:

v3.17-rc3, correct?

I take it that the test passes on some earlier version?

Thanx, Paul

> | # modprobe g_mass_storage stall=0 removable=1 file=/dev/sda
> | [   44.883554] Number of LUNs=8
> | [   44.886709] Mass Storage Function, version: 2009/09/11
> | [   44.892303] LUN: removable file: (no medium)
> | [   44.896916] Number of LUNs=1
> | [   44.901198] LUN: removable file: /dev/sda
> | [   44.905410] Number of LUNs=1
> | [   44.917706] g_mass_storage gadget: Mass Storage Gadget, version: 
> 2009/09/11
> | [   44.925018] g_mass_storage gadget: userspace failed to provide 
> iSerialNumber
> | [   44.932489] g_mass_storage gadget: g_mass_storage ready
> | [   52.583773] g_mass_storage gadget: high-speed config #1: Linux 
> File-Backed Storage
> | # [   98.270585] Unable to handle kernel paging request at virtual address 
> 
> | [   98.278198] pgd = c0004000
> | [   98.281027] [] *pgd=ae7f6821, *pte=, *ppte=
> | [   98.287648] Internal error: Oops: 17 [#1] SMP ARM
> | [   98.292559] Modules linked in: g_mass_storage usb_f_mass_storage 
> libcomposite configfs usb_storage xhci_hcd dwc3 udc_core matrix_keypad 
> lis3lv02d_i2c dwc3_omap lis3lv02d input_polldev
> | [   98.309721] CPU: 0 PID: 1820 Comm: file-storage Not tainted 
> 3.17.0-rc3-00013-gc6b1a7d #806
> | [   98.318346] task: ec356040 ti: ec378000 task.ti: ec378000
> | [   98.324000] PC is at find_get_entry+0x7c/0x128
> | [   98.328640] LR is at 0xfffa
> | [   98.331912] pc : []lr : []psr: a013
> | [   98.331912] sp : ec379b50  ip :   fp : ec379b84
> | [   98.343888] r10: c0c81243  r9 : 0001  r8 : ea123d28
> | [   98.349352] r7 : ec378010  r6 : 0001  r5 :   r4 : 000f
> | [   98.356181] r3 : ec379b3c  r2 :   r1 : 0001  r0 : 
> | [   98.363006] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment 
> kernel
> | [   98.370646] Control: 10c5387d  Table: ac2b0059  DAC: 0015
> | [   98.376641] Process file-storage (pid: 1820, stack limit = 0xec378248)
> | [   98.383454] Stack: (0xec379b50 to 0xec37a000)
> | [   98.388003] 9b40:   
> c01138d0 c002aa3c
> | [   98.396560] 9b60: 000f  ea123d24 000200d0 0001 00d0 
> ec379bbc ec379b88
> | [   98.405100] 9b80: c0114360 c01138dc c1486a00 6013 ec379bc4 1400 
>  ea123d24
> | [   98.413635] 9ba0: 0c00 0400 ec378010 c06dea0c ec379bdc ec379bc0 
> c011478c c0114330
> | [   98.422183] 9bc0: 00d0 c00904f8 c1486a00 1400 ec379c04 ec379be0 
> c019cd68 c0114760
> | [   98.430732] 9be0: c0090808 c0090590 ec379c34 0001 0c00 ea123d24 
> ec379c2c ec379c08
> | [   98.439300] 9c00: c019ecbc c019cd44 0c00 0001 ec379c58 c019eb9c 
> 0c00 ec379d54
> | [   98.447860] 9c20: ec379c8c ec379c30 c0113f14 c019ec8c 0c00 0001 
> ec379c58 ec379c5c
> | [   98.456414] 9c40: ec378030 0001 ec250cc0  1400  
> c018195c c00acd08
> | [   98.464974] 9c60: 5408b05a 1000 ec250cc0  ec379d68 ea123d24 
> ec378010 
> | [   98.473533] 9c80: ec379cf4 ec379c90 c0115ed4 c0113e6c 0001  
> c019f2b0 c0090590
> | [   98.482071] 9ca0: ec379cc4 ec378010 c06c3df4 1000 ea123c64 c019f2b0 
> ec379d54 ec379cc8
> | [   98.490607] 9cc0: 1400  0001 ec379d68 ec379d54 ec379e30 
> ec250cc0 ec356040
> | [   98.499178] 9ce0: ed7ab800 ec30d800 ec379d3c ec379cf8 c019f2b0 c0115c8c 
> c06be3b8 c006dcec
> | [   98.507741] 9d00: ec1b0010 ec30d800 ec379d08 ec379d08 ec379d10 ec379d10 
> ec379d18 ec379d18
> | [   98.516288] 9d20: 1400  ec379e30 ec250cc0 ec379dc4 ec379d40 
> c016618c c019f284
> | [   98.524833] 9d40: 1000 c0317b78 ec379d7c ec394000 1000 0003 
>  1000
> | [   98.533385] 9d60: ec379d4c 0001 ec250cc0    
> ec356040 
> | [   98.541946] 9d80:   1400  1000  
>  
> | [   98.550482] 9da0: ec394000 ec250cc0 ec394000 ec379e30 1000 1000 
> ec379df4 ec379dc8
> | [   98.559023] 9dc0: c0166a3c c01660f4 0002 ec0ace20 1000 000e 
> ec0ace00 
> | [   98.567567] 9de0: 1000 ed7ab800 ec379e64 ec379df8 bf0bc3b4 c0166994 
> 006f 1000
> | [   98.576112] 9e00: bf0bc7a4 6013 e8156000 000e 3930343d  
> bf0bc7a4 ec0ace00
> | [   98.584660] 9e20: 2400  1400  1400  
> ec379e64 
> | [   98.593193] 9e40: ed36ddc0 ec378018 ec30d894 ec0ace00 ec30d800 ec30d840 
> ec379ed4 ec379e68
> | [   98.601754] 9e60: bf0bd1c8 bf0bc08c bf0bf6ec ec378010 c06c3df4 ec356040 
> 0001 
> | [   98.610305] 9e80: ec379eac ec379e90 c00906b0 c00904f8 ec30d894 ed36ddc

Re: Memory synchronization vs. interrupt handlers

2013-08-29 Thread Paul E. McKenney
On Wed, Aug 28, 2013 at 01:28:08PM -0700, H. Peter Anvin wrote:
> On 08/28/2013 12:16 PM, Alan Stern wrote:
> > Russell, Peter, and Ingo:
> > 
> > Can you folks enlighten us regarding this issue for some common 
> > architectures?
> 
> On x86, IRET is a serializing instruction; it guarantees hard
> serialization of absolutely everything.

So a second interrupt from this same device could not appear to happen
before the IRET, no matter what device and/or I/O bus?  Or is IRET
defined to synchronize all the way out to the whatever device is
generating the next interrupt?

> I would expect architectures that have weak memory ordering to put
> appropriate barriers in the IRQ entry/exit code.

Adding a few on CC.  Also restating the question as I understand it:

Suppose that a given device generates an interrupt on CPU 0,
but that before CPU 0's interrupt handler completes, this device
wants to generate a second interrupt on CPU 1.  This can happen
as soon as CPU 0's handler does an EOI or equivalent.

Can CPU 1's interrupt handler assume that all the in-memory effects
of CPU 0's interrupt handler will be visible, even if neither
interrupt handler uses locking or memory barriers?
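
A handler that does not want to depend on the answer can of course supply
its own ordering, for example with a lock shared by all of its invocations.
A generic sketch (my_dev and its fields are made up):

        static irqreturn_t my_irq_handler(int irq, void *dev_id)
        {
                struct my_dev *dev = dev_id;

                /*
                 * The acquire/release ordering of dev->lock makes the
                 * counter update from one invocation visible to the
                 * next, whichever CPU it runs on.
                 */
                spin_lock(&dev->lock);
                dev->counter++;
                spin_unlock(&dev->lock);

                return IRQ_HANDLED;
        }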

Thanx, Paul



Re: Memory synchronization vs. interrupt handlers

2013-08-27 Thread Paul E. McKenney
On Mon, Aug 26, 2013 at 11:49:15AM -0400, Alan Stern wrote:
> David and Paul:
> 
> Here's a question that doesn't seem to be answered in 
> Documentation/memory-barriers.txt.  Are memory accesses within an 
> interrupt handler synchronized with respect to interrupts?
> 
> In more detail, suppose we have an interrupt handler that uses a memory
> variable A.  The device attached to the IRQ line sends two interrupt
> requests, and we get:
> 
>   CPU 0                           CPU 1
>   -----                           -----
>   Receive IRQ
>   Call the interrupt handler
>   Write A
>   Finish IRQ processing
> 
>                                   Receive IRQ
>                                   Call the interrupt handler
>                                   Read A
>                                   Finish IRQ processing
> 
> Is CPU 0's write to A guaranteed to be visible on CPU 1?  Given that 
> interrupts on an IRQ line are serialized, and that IRQ processing must 
> involve some amount of memory barriers, I would expect the answer to be 
> Yes.

I have no idea.  I would hope that it did, but a lot depends on how or
whether the end-of-interrupt processing is handled by the I/O hardware.

> Does the answer change if the IRQ line is shared?  I wouldn't expect 
> it to be.
> 
> Now, if the handler were bound to multiple IRQ (or MSI) lines, then
> there'd be no reason to expect this to work.  However, even in this
> case, it seems that as long as we restrict our attention to handler
> invocations in response to interrupt requests from one particular IRQ
> line, the answer should be Yes.  (For example, if device X on IRQ I and 
> device Y on IRQ J both used the same handler, a write to A in response 
> to an interrupt from device X should be visible the next time X sends
> an interrupt.)
> 
> Do you know the answers?

I believe that we need to ask the architecture maintainers.  And maybe
also someone who knows about the devices in question.

Thanx, Paul



Re: [PATCH 1/3] USB: use percpu counter to count submitted URBs per device

2013-08-19 Thread Paul E. McKenney
On Mon, Aug 19, 2013 at 11:40:50PM +0800, Ming Lei wrote:
> On Mon, Aug 19, 2013 at 11:14 PM, Greg Kroah-Hartman
>  wrote:
> > On Mon, Aug 19, 2013 at 11:06:31PM +0800, Ming Lei wrote:
> >> On Mon, Aug 19, 2013 at 10:00 PM, Greg Kroah-Hartman
> >>  wrote:
> >> > On Mon, Aug 19, 2013 at 07:04:18PM +0800, Ming Lei wrote:
> >> >> Because usb_hcd_submit_urb is in the hottest path of the USB core,
> >> >> use a percpu counter to count URBs instead of an atomic variable,
> >> >> because atomic operations are much slower than percpu operations.
> >> >>
> >> >> Cc: Oliver Neukum 
> >> >> Cc: Alan Stern 
> >> >> Signed-off-by: Ming Lei 
> >> >> ---
> >> >>  drivers/usb/core/hcd.c   |4 ++--
> >> >>  drivers/usb/core/sysfs.c |7 ++-
> >> >>  drivers/usb/core/usb.c   |9 -
> >> >>  drivers/usb/core/usb.h   |1 +
> >> >>  include/linux/usb.h  |2 +-
> >> >>  5 files changed, 18 insertions(+), 5 deletions(-)
> >> >
> >> > And this really speeds things up?  Exactly what does it?
> >> >
> >> > And it's not that atomic operations are "slower", it's just that the
> >>
> >> For SMP, atomic_inc/atomic_dec are much slower than percpu
> >> variable inc/dec, see 4.1 (Why Isn’t Concurrent Counting Trivial?)
> >> of [1].
> >>
> >>  However, it is slower: on an Intel Core Duo laptop, it is about six
> >>  times slower than non-atomic increment when a single thread
> >>  is incrementing, and more than ten times slower if two threads
> >>  are incrementing.
> >>
> >> Considering that most desktops & laptops are SMP now, that with USB 3.0
> >> the submitted URBs per second may reach tens of thousands or more, and
> >> that we can remove the atomic inc/dec operations from the hot path, why
> >> not do it?
> >
> > Because you really didn't do it, there are lots of other atomic
> > operations on that same path.
> 
> Not lots in the path of usbcore.
> 
> >
> > And, tens of thousands of URBs should be trivial; did you measure this
> > to see if it changed anything?  I'm not taking patches like this that
> > are not quantifiable, sorry.
> 
> The number may be too trivial to measure, but I will try to test
> with perf.
> 
> >
> > The gating problem in USB right now is the hardware, it's the slowest
> > thing, not the kernel, from everything I have ever tested, or seen.
> 
> The patch may not speed up USB performance, but it might decrease
> CPU utilization or cache misses a bit.
> 
> >
> > Well, bad host controller silicon is also a problem (i.e. raspberry pi),
> > but there's not much we can do about braindead problems like that...
> >
> >> > barriers involved can be slower, depending on what else is happening.
> >> > If you look, you are already hitting atomic variables in the same path,
> >> > so how can this change speed anything up?
> >>
> >> No, no barriers are involved in atomic_inc/atomic_dec at all.
> >
> > None?  Hm, you might want to rethink that statement :)
> 
> Please see Documentation/memory-barriers.txt:
> 
> The following also do _not_ imply memory barriers, and so may require explicit
> memory barriers under some circumstances (smp_mb__before_atomic_dec() for
> instance):
> 
> atomic_add();
> atomic_sub();
> atomic_inc();
> atomic_dec();

You are both right, each in your own way.  Greg is correct that on x86
these operations do include memory barriers, even though Linux does not
require them to do so.  Ming is correct that Linux does not require it,
and that there are in fact non-x86 architectures in which these operations
do not include memory barriers.
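
So on those architectures, a caller that needs ordering around one of these
operations has to ask for it explicitly, along the lines of the following
sketch ("pending" is illustrative; the primitive name is the one current at
the time, per the memory-barriers.txt excerpt quoted above):

        static atomic_t pending = ATOMIC_INIT(0);

        /* ... stores that must be ordered before the decrement ... */
        smp_mb__before_atomic_dec();    /* caller-supplied full barrier */
        atomic_dec(&pending);           /* no barrier implied by this alone */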

Thanx, Paul
