Hi, Vincent:
        We tried my patch doesn't fix the issue totally, we still see the 
kernel panic.
        currently solution in our test is: Rvert "usb: gadget: ffs: Fix BUG 
when userland exits with submitted AIO transfers"
        and only add the below patch to use the in_interrupt() to avoid the 
wait_event_lock_irq() is called in interrupt context.

Subject: [PATCH] fix the wait_event_lock_irq run in interrupt context

the patch is to fix the below bug:
BUG: scheduling while atomic: SettingsProvide/3433/0x00000104
Preemption disabled at:
[<ffffffff81e00053>] __do_softirq+0x53/0x31b
CPU: 1 PID: 3433 Comm: SettingsProvide Tainted: P     U
ilt-2e5dc0ac-g3f662bf-dirty #8
Call Trace:
 <IRQ>
 dump_stack+0x70/0x9e
 ? __do_softirq+0x53/0x31b
 __schedule_bug+0x7f/0xd0
 __schedule+0x61d/0x860
 ? _raw_spin_unlock_irqrestore+0x28/0x50
 ? prepare_to_wait_event+0x87/0x170
 schedule+0x36/0x90
 dwc3_gadget_ep_dequeue+0x27a/0x2e0 [dwc3]
 ? finish_wait+0x90/0x90
 usb_ep_dequeue+0x23/0x90
 ffs_aio_cancel+0x4c/0x70
 kiocb_cancel+0x40/0x50
 free_ioctx_users+0x6e/0x100
 percpu_ref_switch_to_atomic_rcu+0x159/0x160
 rcu_process_callbacks+0x1a7/0x520
 __do_softirq+0x11a/0x31b
 irq_exit+0xb5/0xc0
 smp_apic_timer_interrupt+0x7c/0x160

the BUG is introduced form the patch:
commit: cf3113d89
usb: dwc3: gadget: properly increment dequeue pointer on ep_dequeue

Signed-off-by: he, bo <bo...@intel.com>
---
 drivers/usb/dwc3/gadget.c | 112 +++++++++++++++++++++++-----------------------
 1 file changed, 57 insertions(+), 55 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 7120678..386fd82 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1413,65 +1413,67 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
                        /* wait until it is processed */
                        dwc3_stop_active_transfer(dwc, dep->number, true);
 
-                       /*
-                        * If request was already started, this means we had to
-                        * stop the transfer. With that we also need to ignore
-                        * all TRBs used by the request, however TRBs can only
-                        * be modified after completion of END_TRANSFER
-                        * command. So what we do here is that we wait for
-                        * END_TRANSFER completion and only after that, we jump
-                        * over TRBs by clearing HWO and incrementing dequeue
-                        * pointer.
-                        *
-                        * Note that we have 2 possible types of transfers here:
-                        *
-                        * i) Linear buffer request
-                        * ii) SG-list based request
-                        *
-                        * SG-list based requests will have r->num_pending_sgs
-                        * set to a valid number (> 0). Linear requests,
-                        * normally use a single TRB.
-                        *
-                        * For each of these two cases, if r->unaligned flag is
-                        * set, one extra TRB has been used to align transfer
-                        * size to wMaxPacketSize.
-                        *
-                        * All of these cases need to be taken into
-                        * consideration so we don't mess up our TRB ring
-                        * pointers.
-                        */
-                       wait_event_lock_irq(dep->wait_end_transfer,
-                                       !(dep->flags & 
DWC3_EP_END_TRANSFER_PENDING),
-                                       dwc->lock);
-
-                       if (!r->trb)
-                               goto out0;
-
-                       if (r->num_pending_sgs) {
-                               struct dwc3_trb *trb;
-                               int i = 0;
-
-                               for (i = 0; i < r->num_pending_sgs; i++) {
-                                       trb = r->trb + i;
-                                       trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
-                                       dwc3_ep_inc_deq(dep);
-                               }
+                       if(likely(!in_interrupt())) {
+                               /*
+                                * If request was already started, this means 
we had to
+                                * stop the transfer. With that we also need to 
ignore
+                                * all TRBs used by the request, however TRBs 
can only
+                                * be modified after completion of END_TRANSFER
+                                * command. So what we do here is that we wait 
for
+                                * END_TRANSFER completion and only after that, 
we jump
+                                * over TRBs by clearing HWO and incrementing 
dequeue
+                                * pointer.
+                                *
+                                * Note that we have 2 possible types of 
transfers here:
+                                *
+                                * i) Linear buffer request
+                                * ii) SG-list based request
+                                *
+                                * SG-list based requests will have 
r->num_pending_sgs
+                                * set to a valid number (> 0). Linear requests,
+                                * normally use a single TRB.
+                                *
+                                * For each of these two cases, if r->unaligned 
flag is
+                                * set, one extra TRB has been used to align 
transfer
+                                * size to wMaxPacketSize.
+                                *
+                                * All of these cases need to be taken into
+                                * consideration so we don't mess up our TRB 
ring
+                                * pointers.
+                                */
+                               wait_event_lock_irq(dep->wait_end_transfer,
+                                               !(dep->flags & 
DWC3_EP_END_TRANSFER_PENDING),
+                                               dwc->lock);
+
+                               if (!r->trb)
+                                       goto out0;
+
+                               if (r->num_pending_sgs) {
+                                       struct dwc3_trb *trb;
+                                       int i = 0;
+
+                                       for (i = 0; i < r->num_pending_sgs; 
i++) {
+                                               trb = r->trb + i;
+                                               trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
+                                               dwc3_ep_inc_deq(dep);
+                                       }
+
+                                       if (r->unaligned || r->zero) {
+                                               trb = r->trb + 
r->num_pending_sgs + 1;
+                                               trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
+                                               dwc3_ep_inc_deq(dep);
+                                       }
+                               } else {
+                                       struct dwc3_trb *trb = r->trb;
 
-                               if (r->unaligned || r->zero) {
-                                       trb = r->trb + r->num_pending_sgs + 1;
                                        trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
                                        dwc3_ep_inc_deq(dep);
-                               }
-                       } else {
-                               struct dwc3_trb *trb = r->trb;
 
-                               trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
-                               dwc3_ep_inc_deq(dep);
-
-                               if (r->unaligned || r->zero) {
-                                       trb = r->trb + 1;
-                                       trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
-                                       dwc3_ep_inc_deq(dep);
+                                       if (r->unaligned || r->zero) {
+                                               trb = r->trb + 1;
+                                               trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
+                                               dwc3_ep_inc_deq(dep);
+                                       }
                                }
                        }
                        goto out1;
-- 
2.7.4





-----Original Message-----
From: Vincent Pelletier <plr.vinc...@gmail.com> 
Sent: Tuesday, September 25, 2018 8:46 PM
To: He, Bo <bo...@intel.com>
Cc: Felipe Balbi <felipe.ba...@linux.intel.com>; Alan Stern 
<st...@rowland.harvard.edu>; Roger Quadros <rog...@ti.com>; Lars-Peter Clausen 
<l...@metafoo.de>; Sam Protsenko <semen.protse...@linaro.org>; 
linux-usb@vger.kernel.org; Praneeth Bajjuri <prane...@ti.com>; Bai, Jie A 
<jie.a....@intel.com>
Subject: Re: usb: gadget: ffs: Fix BUG when userland exits with submitted AIO 
transfers

Hello,

On Thu, 2 Aug 2018 14:23:51 +0000, Vincent Pelletier <plr.vinc...@gmail.com> 
wrote:
> On Thu, 2 Aug 2018 00:45:14 +0000, "He, Bo" <bo...@intel.com> wrote:
> > Your patch fix the issue BUG: scheduling while atomic:  
> 
> Yes, although from my understanding of Felipe's answer, the actual bug 
> is the "scheduling" part (sleeping in dwc3 UDC) rather than the 
> "atomic" part.
> 
> So my patch addresses, still if my understanding is correct, the wrong 
> half of the problem, and even introduced the regression you identified.
> Hence my uncertainty...

I notice that neither He's patch, nor a dwc3 change to prevent it from 
scheduling inside usb_ep_dequeue are in Linus' master. Please correct me if I 
missed something.

Just in case my previous emails were not clear:
- I have no objection to He's patch on its own (and I do not know the
  code nearly enough to provide a meaningful reviewed-by).
- I do not intend to work on making changes to dwc3 gadget to stop it
  from scheduling in usb_ep_dequeue in the foreseeable future.

So if there is no ongoing work on dwc3 cancel behaviour (Felipe ?), please do 
resume/carry on with reviewing and integrating He's patch.

It is only *if* dwc3 cancel stops scheduling that I believe my patch should be 
reverted (here is the hash as of Linus' master):
  commit d52e4d0c0c428bf2ba35074a7495cdb28e2efbae
  Author: Vincent Pelletier <plr.vinc...@gmail.com>
  Date:   Wed Jun 13 11:05:06 2018 +0000

      usb: gadget: ffs: Fix BUG when userland exits with submitted AIO transfers

      This bug happens only when the UDC needs to sleep during usb_ep_dequeue,
      as is the case for (at least) dwc3.
and, in my understanding, a consequence is that He's fix would not be needed 
anymore - the bug my patch introduced disappearing in the revert.

Regards,
--
Vincent Pelletier

Reply via email to