Re: Memory barrier needed with wake_up_process()?

Felipe Balbi Wed, 07 Sep 2016 03:13:48 -0700

Hi,

Alan Stern <[email protected]> writes:
> On Tue, 6 Sep 2016, Peter Zijlstra wrote:
>
>> On Tue, Sep 06, 2016 at 01:49:37PM +0200, Peter Zijlstra wrote:
>> > On Tue, Sep 06, 2016 at 02:43:39PM +0300, Felipe Balbi wrote:
>> 
>> > > My fear now, however, is that changing smp_[rw]mb() to smp_mb() just
>> > > adds extra overhead which makes the problem much, much less likely to
>> > > happen. Does that sound plausible to you?
>> > 
>> > I did consider that, but I've not sufficiently grokked the code to rule
>> > out actual fail. So let me stare at this a bit more.
>> 
>> OK, so I'm really not seeing it, we've got:
>> 
>> while (bh->state != FULL) {
>>         for (;;) {
>>                 set_current_state(INTERRUPTIBLE); /* MB after */
>>                 if (signal_pending(current))
>>                         return -EINTR;
>>                 if (common->thread_wakeup_needed)
>>                         break;
>>                 schedule(); /* MB */
>>         }
>>         __set_current_state(RUNNING);
>>         common->thread_wakeup_needed = 0;
>>         smp_rmb(); /* NOP */
>> }
>> 
>> 
>> VS.
>> 
>> 
>> spin_lock(&common->lock); /* MB */
>> bh->state = FULL;
>> smp_wmb(); /* NOP */
>> common->thread_wakeup_needed = 1;
>> wake_up_process(common->thread_task); /* MB before */
>> spin_unlock(&common->lock);
>> 
>> 
>> 
>> (the MB annotations specific to x86, not true in general)
>> 
>> 
>> If we observe thread_wakeup_needed, we must also observe bh->state.
>> 
>> And the sleep/wakeup ordering is also correct, we either see
>> thread_wakeup_needed and continue, or we see task->state == RUNNING
>> (from the wakeup) and NO-OP schedule(). The MB from set_current_statE()
>> then matches with the MB from wake_up_process() to ensure we must see
>> thead_wakeup_needed.
>> 
>> Or, we go sleep, and get woken up, at which point the same happens.
>> Since the waking CPU gets the task back on its RQ the happens-before
>> chain includes the waking CPUs state along with the state of the task
>> itself before it went to sleep.
>> 
>> At which point we're back where we started, once we see
>> thread_wakeup_needed we must then also see bh->state (and all state
>> prior to that on the waking CPU).
>> 
>> 
>> 
>> There's enough cruft in the while-sleep loop to force reload bh->state.
>> 
>> Load/store tearing cannot be a problem because all values are single
>> bytes (the variables are multi bytes, but all values used only affect
>> the LSB).
>> 
>> Colour me puzzled.
>
> Felipe, can you please try this patch on an unmodified tree?  If the 
> problem still occurs, what shows up in the kernel log?
>
> Alan Stern
>
>
>
> Index: usb-4.x/drivers/usb/gadget/function/f_mass_storage.c
> ===================================================================
> --- usb-4.x.orig/drivers/usb/gadget/function/f_mass_storage.c
> +++ usb-4.x/drivers/usb/gadget/function/f_mass_storage.c
> @@ -485,6 +485,8 @@ static void bulk_out_complete(struct usb
>       spin_lock(&common->lock);
>       bh->outreq_busy = 0;
>       bh->state = BUF_STATE_FULL;
> +     if (bh->bulk_out_intended_length == US_BULK_CB_WRAP_LEN)
> +             INFO(common, "compl: bh %p state %d\n", bh, bh->state);
>       wakeup_thread(common);
>       spin_unlock(&common->lock);
>  }
> @@ -2207,6 +2209,7 @@ static int get_next_command(struct fsg_c
>               rc = sleep_thread(common, true);
>               if (rc)
>                       return rc;
> +             INFO(common, "next: bh %p state %d\n", bh, bh->state);
>       }
>       smp_rmb();
>       rc = fsg_is_set(common) ? received_cbw(common->fsg, bh) : -EIO;


I've replace INFO() with trace_printk() (which is what I have been using
anyway):

diff --git a/drivers/usb/gadget/function/f_mass_storage.c 
b/drivers/usb/gadget/function/f_mass_storage.c
index 2505117e88e8..dbc6a380b38b 100644
--- a/drivers/usb/gadget/function/f_mass_storage.c
+++ b/drivers/usb/gadget/function/f_mass_storage.c
@@ -485,6 +485,8 @@ static void bulk_out_complete(struct usb_ep *ep, struct 
usb_request *req)
        spin_lock(&common->lock);
        bh->outreq_busy = 0;
        bh->state = BUF_STATE_FULL;
+       if (bh->bulk_out_intended_length == US_BULK_CB_WRAP_LEN)
+               trace_printk("compl: bh %p state %d\n", bh, bh->state);
        wakeup_thread(common);
        spin_unlock(&common->lock);
 }
@@ -2207,6 +2209,7 @@ static int get_next_command(struct fsg_common *common)
                rc = sleep_thread(common, true);
                if (rc)
                        return rc;
+               trace_printk("next: bh %p state %d\n", bh, bh->state);
        }
        smp_rmb();
        rc = fsg_is_set(common) ? received_cbw(common->fsg, bh) : -EIO;

But I can't reproduce as reliably as before. I'll keep the thing running
an infinite loop which will stop only when interrupts in UDC (dwc3 in
this case) stop increasing.

-- 
balbi

signature.asc
Description: PGP signature

Re: Memory barrier needed with wake_up_process()?

Reply via email to