On Wed, Nov 09, 2016 at 10:54:38AM -0700, Tony Lindgren wrote:
> * Johan Hovold <[email protected]> [161109 08:40]:
> > On Tue, Nov 08, 2016 at 06:26:07PM -0700, Tony Lindgren wrote:
> > > * Johan Hovold <[email protected]> [161108 12:03]:
> > > > On Tue, Nov 08, 2016 at 10:34:13AM -0700, Tony Lindgren wrote:
> > > > > * Johan Hovold <[email protected]> [161108 10:09]:
> > > > > > In fact, the dsps timer must also be cancelled on suspend, or you
> > > > > > could
> > > > > > end up calling dsps_check_status() while suspended (it is currently
> > > > > > not
> > > > > > cancelled until the parent device is suspended, which could be too
> > > > > > late).
> > > > >
> > > > > And then this should no longer be an issue either.
> > > >
> > > > It would still be an issue as a system-suspending device could already
> > > > have been runtime-resumed so that dsps_check_status() would be called
> > > > directly from the timer function.
> > >
> > > The glue layers should do pm_runtime_get_sync(musb->controller) which
> > > dsps glue already does. So that's the musb_core.c device instance. And
> > > looks like we have dsps_suspend() call del_timer_sync(&glue->timer)
> > > already. I think we're safe here.
> >
> > But the point is that the controller might be RPM_ACTIVE if the
> > controller was already runtime resumed when it is system suspended.
> >
> > Since this (and the previous) patch run the work directly from the timer
> > callback if active, it could end up accessing the controller after it
> > has been system suspended. Specifically, stopping the timer in the glue
> > (parent) suspend callback is too late to avoid this.
> >
> > pm_runtime_get_sync(musb->controller);
> > musb_runtime_resume()
> > musb_restore_context();
> >
> > ...
> >
> > musb_suspend()
> > musb_save_context();
> >
> > otg_timer()
> > pm_runtime_get();
> > if (pm_runtime_active(musb->controller))
> > dsps_check_status();
> > pm_runtime_put_autosuspend();
> >
> > dsps_suspend()
> > del_timer_sync();
> >
>
> OK so we need to return without doing anything from otg_timer() on
> pm_runtime_get() to avoid that.
I'm afraid that won't work as pm_runtime_get() would still succeed (i.e.
even after musb_suspend()).
See 6f3c77b040fc ("PM / Runtime: let rpm_resume() succeed if RPM_ACTIVE,
even when disabled, v2").
> In the long run it would be nice to make whatever optional state polling
> musb generic with just a glue layer callback.
Yes, and make sure to stop polling in musb_suspend(). Would it be
possible to use the enable and disable ops for this until then?
> > > +/*
> > > + * Called to run work if device is active or else queue the work to
> > > happen
> > > + * on resume. Caller must take musb->lock.
> >
> > Caller must also hold an RPM reference.
>
> Good point, will add.
>
> > > + if (musb->is_runtime_suspended) {
> > > + list_add_tail(&w->node, &musb->pending_list);
> > > + error = 0;
> > > + } else {
> > > + dev_err(musb->controller, "could not add resume work %p\n",
> > > + callback);
> > > + devm_kfree(musb->controller, w);
> > > + error = -EINPROGRESS;
> >
> > But this means you should be able to run the callback below, right? It
> > has to be run from somewhere so otherwise the caller needs to retry
> > instead.
>
> Well there's no longer need to run the callback at that point any longer
> and with that removed that should not be an issue.
>
> Anyways, musb->is_runtime_suspended is needed to protect anything from
> being queued between runtime resume having called musb_run_resume_work()
> and before pm_runtime_active() is true. At that point the caller could
> just wait for pm_runtime_active() to be set and run the code. But based
> on my tests that does not happen and queueing is faster than getting to
> the pm_runtime_active() state so we just print errors in those cases if
> we ever hit it later on.
But for a generic solution, this race could still be an issue. What
about musb_gadget_queue() for example? Would not failing to start I/O
if racing with resume be a problem?
> Sounds like the rest of your comments are no longer an issue, please
> let me know if that's not the case.
I think the barrier comment for the WARN_ON on musb_runtime_suspend()
still applies.
> 8< ---------------------------
> From tony Mon Sep 17 00:00:00 2001
> From: Tony Lindgren <[email protected]>
> Date: Wed, 2 Nov 2016 19:59:05 -0700
> Subject: [PATCH] usb: musb: Fix sleeping function called from invalid
> context for hdrc glue
>
> Commit 65b3f50ed6fa ("usb: musb: Add PM runtime support for MUSB DSPS
> glue layer") wrongly added a call for pm_runtime_get_sync to otg_timer
> that runs in softirq context. That causes a "BUG: sleeping function called
> from invalid context" every time when polling the cable status:
>
> [<c015ebb4>] (__might_sleep) from [<c0413d60>] (__pm_runtime_resume+0x9c/0xa0)
> [<c0413d60>] (__pm_runtime_resume) from [<c04d0bc4>] (otg_timer+0x3c/0x254)
> [<c04d0bc4>] (otg_timer) from [<c0191180>] (call_timer_fn+0xfc/0x41c)
> [<c0191180>] (call_timer_fn) from [<c01915c0>] (expire_timers+0x120/0x210)
> [<c01915c0>] (expire_timers) from [<c0191acc>] (run_timer_softirq+0xa4/0xdc)
> [<c0191acc>] (run_timer_softirq) from [<c010168c>] (__do_softirq+0x12c/0x594)
>
> I did not notice that as I did not have CONFIG_DEBUG_ATOMIC_SLEEP enabled.
> And looks like also musb_gadget_queue() suffers from the same problem.
>
> Let's fix the issue by using a list of delayed work then call it on
> resume. Note that we want to do this only when musb core and it's
> parent devices are awake as noted by Johan Hovold <[email protected]>.
>
> Later on we may be able to remove other delayed work in the musb driver
> and just do it from pending_resume_work. But this should be done only
> for delayed work that does not have other timing requirements beyond
> just being run on resume.
>
> Fixes: 65b3f50ed6fa ("usb: musb: Add PM runtime support for MUSB DSPS
> glue layer")
> Reported-by: Johan Hovold <[email protected]>
> Signed-off-by: Tony Lindgren <[email protected]>
> ---
> drivers/usb/musb/musb_core.c | 112
> +++++++++++++++++++++++++++++++++++++++--
> drivers/usb/musb/musb_core.h | 7 +++
> drivers/usb/musb/musb_dsps.c | 34 +++++++++----
> drivers/usb/musb/musb_gadget.c | 21 ++++++--
> 4 files changed, 157 insertions(+), 17 deletions(-)
> @@ -2622,14 +2714,18 @@ static int musb_runtime_suspend(struct device *dev)
> {
> struct musb *musb = dev_to_musb(dev);
>
> + WARN_ON(!list_empty(&musb->pending_list));
> musb_save_context(musb);
> + musb->is_runtime_suspended = 1;
Isn't the compiler free to reorder the store to is_runtime_suspended
here so that the WARN_ON could trigger when queuing is racing with
runtime suspend?
Removing the WARN_ON, or moving it under list_lock while updating
is_runtime_suspended should do.
> return 0;
> }
> +static void otg_timer(unsigned long _musb)
> +{
> + struct musb *musb = (void *)_musb;
> + struct device *dev = musb->controller;
> + unsigned long flags;
> + int err;
> +
> + err = pm_runtime_get(dev);
> + if (err < 0) {
> + dev_err(dev, "Poll could not pm_runtime_get: %i\n", err);
So as mentioned above, this would not be sufficient to detect that dev
has been (system) suspended.
> +
> + return;
> + }
> +
> + spin_lock_irqsave(&musb->lock, flags);
> + err = musb_queue_resume_work(musb, dsps_check_status, NULL);
> + if (err < 0)
> + dev_err(dev, "%s resume work: %i\n", __func__, err);
> + spin_unlock_irqrestore(&musb->lock, flags);
> pm_runtime_mark_last_busy(dev);
> pm_runtime_put_autosuspend(dev);
> }
> diff --git a/drivers/usb/musb/musb_gadget.c b/drivers/usb/musb/musb_gadget.c
> --- a/drivers/usb/musb/musb_gadget.c
> +++ b/drivers/usb/musb/musb_gadget.c
> @@ -1222,6 +1222,15 @@ void musb_ep_restart(struct musb *musb, struct
> musb_request *req)
> rxstate(musb, req);
> }
>
> +static int musb_ep_restart_resume_work(struct musb *musb, void *data)
> +{
> + struct musb_request *req = data;
> +
> + musb_ep_restart(musb, req);
> +
> + return 0;
> +}
> +
> static int musb_gadget_queue(struct usb_ep *ep, struct usb_request *req,
> gfp_t gfp_flags)
> {
> @@ -1255,7 +1264,7 @@ static int musb_gadget_queue(struct usb_ep *ep, struct
> usb_request *req,
>
> map_dma_buffer(request, musb, musb_ep);
>
> - pm_runtime_get_sync(musb->controller);
> + pm_runtime_get(musb->controller);
Add the missing error handling here too?
Johan
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html