On Mon, 20 Aug 2018, Adrian Ambrożewicz wrote:

> Hello,
> 
> I'm consistently observing kernel warnings related to Mass Storage USB
> Gadget de-initialization flow. After investigation I believe that I've
> found root cause of these warnings, however I'm unable to estimate the
> impact. I would like to put the issue into discussion about possible
> side-effects.
> 
> My usage model is to emulate file system using f_mass_storage gadget.
> Whenever I pull the USB cable out I see warning related to
> "dwc2_hsotg_init_fifo" which is commented in the following way: "/*
> Reset fifo map if not correctly cleared during previous session */".
> 
> Assumption I've made were the following:
> 1) disconnect handler notifies all gadgets with call_gadget(.., disconnect)
> 2) gadgets are then responsible to clear the ep fifos with usb_ep_disable()
> 3) disconnect handler initializes the fifo maps to clear state
> 
> Unfortunately in my case the warning occurs every time when using
> f_mass_storage gadget. By comparison with f_hid gadget I've come up
> with conclusion that it's caused by race condition in the way the
> "disable" flow is implemented in mass storage.
> 
> "Correct" flow implemented by HID gadget is the following:
> * dwc2_hsotg_irq (USBRst)
> ** dwc2_hsotg_disconnect
> *** call_gadget(disconnect)
> **** hidg_disable
> ***** usb_ep_disable // fifos utilized by device are cleared in fifo_map
> ** dwc_hsotg_core_init_disconnected
> *** dwc2_hsotg_init_fifo // fifo_map is empty - no warning here
> 
> By comparison here is flow observed in Mass Storage gadget. Race
> condition is introduced by utilizing separate worker thread to handle
> events:
> THREAD #1
> * dwc2_hsotg_irq (USBRst)
> ** dwc2_hsotg_disconnect
> *** call_gadget(disconnect)
> **** fsg_disable()
> ***** raise_exception(FSG_STATE_CONFIG_CHANGE) // Race between Thread
> #1 and Thread #2 starts here.
> ** dwc_hsotg_core_init_disconnected
> *** dwc2_hsotg_init_fifo // fifo_map is object of the race between
> this function, and call to usb_ep_disable()
> 
> THREAD #2
> * handle_exception(FSG_STATE_CONFIG_CHANGE)
> ** do_set_interface(NULL)
> *** usb_ep_disable()

This seems like a bug in the dwc2_hsotg driver; it should include 
sufficient locking to handle races of this sort.

A gadget's disconnect handler is not required to disable all endpoints.  
In fact it can't disable endpoints at all, because the disconnect
handler may be called in interrupt context whereas usb_ep_disable()
must be called in process context.  As a result, usb_ep_disable() is 
expected to be called after the disconnect handler has returned.

> My questions are the following:
> - is this known issue?
> - is it risky? What are possible outcomes?
> - If this race condition is dangerous what would be proper fix ?
> Should fsg_disable() call be blocked until handle_exception() finishes
> the cleanup? I know that it's just a WA for "misaligned" Mass Storage
> gadget architecture, but are there alternatives?

I don't know the answers to these questions.  The dwc2 maintainer 
probably has a better understanding of the issues.

However, the f_mass_storage gadget architecture is not "misaligned".  
And given the current implementation of the composite framework, 
fsg_disable() cannot block.

Alan Stern

> I would really appreciate professional insight on that matter,
> 
> Regards,
> Adrian

Reply via email to